debian/0000755000000000000000000000000012260574561007176 5ustar
debian/readseq.10000644000000000000000000001107311652531702010700 0ustar
.TH READSEQ 1
.\" NAME should be all caps, SECTION should be 1-8, maybe w/ subsection
.\" other parms are allowed: see man(7), man(1)
.SH NAME
readseq \- Reads and writes nucleic/protein sequences in various formats
.SH SYNOPSIS
.B readseq
.I "[-options] in.seq > out.seq"
.SH "DESCRIPTION"
This manual page documents briefly the
.BR readseq
command.
This manual page was written for the Debian GNU/Linux distribution
because the original program does not have a manual page.
Instead, it has documentation in text form, see below.
.PP
.B readseq
reads and writes biosequences (nucleic/protein) in various formats.
Data files may have multiple sequences.
.B readseq
is particularly useful as it automatically detects many sequence formats,
and interconverts among them.
.SH FORMATS
.TP
Formats which readseq currently understands:
.PD 0
.TP
* IG/Stanford, used by Intelligenetics and others
.TP
* GenBank/GB, genbank flatfile format
.TP
* NBRF format
.TP
* EMBL, EMBL flatfile format
.TP
* GCG, single sequence format of GCG software
.TP
* DNAStrider, for common Mac program
.TP
* Fitch format, limited use
.TP
* Pearson/Fasta, a common format used by Fasta programs and others
.TP
* Zuker format, limited use. Input only.
.TP
* Olsen, format printed by Olsen VMS sequence editor. Input only.
.TP
* Phylip3.2, sequential format for Phylip programs
.TP
* Phylip, interleaved format for Phylip programs (v3.3, v3.4)
.TP
* Plain/Raw, sequence data only (no name, document, numbering)
.TP
+ MSF multi sequence format used by GCG software
.TP
+ PAUP's multiple sequence (NEXUS) format
.TP
+ PIR/CODATA format used by PIR
.TP
+ ASN.1 format used by NCBI
.TP
+ Pretty print with various options for nice looking output. Output only.
.TP
+ LinAll format, limited use (LinAll and ConStruct programs)
.TP
+ Vienna format used by ViennaRNA programs
.PD 1
.TP
See the included "Formats" file for details on file formats.
.SH OPTIONS
.PD 0
.TP
.B \-help
Show summary of options.
.TP
.B \-a[ll]
Select All sequences
.TP
.B \-c[aselower]
Change to lower case
.TP
.B \-C[ASEUPPER]
Change to UPPER CASE
.TP
.B \-degap[=-]
Remove gap symbols
.TP
.B \-i[tem=2,3,4]
Select Item number(s) from several
.TP
.B \-l[ist]
List sequences only
.TP
.B \-o[utput=]out.seq
Redirect Output
.TP
.B \-p[ipe]
Pipe (command line, stdout)
.TP
.B \-r[everse]
Change to Reverse-complement
.TP
.B \-v[erbose]
Verbose progress
.TP
.B \-f[ormat=]#
Format number for output, or
\-f[ormat=]Name
Format name for output:
      1. IG/Stanford           11. Phylip3.2
      2. GenBank/GB            12. Phylip
      3. NBRF                  13. Plain/Raw
      4. EMBL                  14. PIR/CODATA
      5. GCG                   15. MSF
      6. DNAStrider            16. ASN.1
      7. Fitch                 17. PAUP/NEXUS
      8. Pearson/Fasta         18. Pretty (out-only)
      9. Zuker (in-only)       19. LinAll
     10. Olsen (in-only)       20. Vienna
Pretty format options:
.TP
.B \-wid[th]=#
Sequence line width
.TP
.B \-tab=#
Left indent
.TP
.B \-col[space]=#
Column space within sequence line on output
.TP
.B \-gap[count]
Count gap chars in sequence numbers
.TP
.B \-nameleft, \-nameright[=#]
Name on left/right side [=max width]
.TP
.B \-nametop
Name at top/bottom
.TP
.B \-numleft, \-numright
Seq index on left/right side
.TP
.B \-numtop, \-numbot
Index on top/bottom
.TP
.B \-match[=.]
Use match base for 2..n species
.TP
.B \-inter[line=#]
Blank line(s) between sequence blocks
.PD 1
.SH EXAMPLES
.PD 0
.TP
readseq
-- for interactive use
.TP
readseq my.1st.seq my.2nd.seq \-all \-format=genbank \-output=my.gb
-- convert all of two input files to one genbank format output file
.TP
readseq my.seq \-all \-form=pretty \-nameleft=3 \-numleft \-numright \-numtop \-match
-- output to standard output a file in a pretty format
.TP
readseq my.seq \-item=9,8,3,2 \-degap \-CASE \-rev \-f=msf \-out=my.rev
-- select 4 items from input, degap, reverse, and uppercase them
.TP
cat *.seq | readseq \-pipe \-all \-format=asn > bunch-of.asn
-- pipe a bunch of data thru readseq, converting all to asn
.PD 1
.SH "SEE ALSO"
The programs are documented fully in text form. See the files in
.I /usr/share/doc/readseq
.SH AUTHOR
This manual page was written by Stephane Bortzmeyer,
for the Debian GNU/Linux system (but may be used by others).
debian/copyright0000644000000000000000000000174212242415310011117 0ustar
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: Readseq
Source: http://iubio.bio.indiana.edu/soft/molbio/readseq/version1/src/

Files: *
Copyright: 1990-1993 D. Gilbert biology dept., indiana university
License: PD
 * This program may be freely copied and used by anyone.
 * Developers are encouraged to incorporate parts in their
 * programs, rather than devise their own private sequence
 * format.
 .
 From gilbertd@bio.indiana.edu Tue Oct 30 08:18:23 2001
 Date: Mon, 29 Oct 2001 13:07:26 -0500 (EST)
 From: Don Gilbert
 Subject: Re: [ITA] Intent To Adopt Readseq
 .
 Andreas,
 .
 Please feel free to use readseq code as you like; I've placed it
 in the public domain.
 .
 - Don

Files: debian/*
Copyright: © 2001-2013 Andreas Tille
License: GPLv2+
 The Debian packaging is licensed under the GPL which is available at
 `/usr/share/common-licenses/GPL'.
debian/manpages0000644000000000000000000000002111652531702010677 0ustar
debian/readseq.1
debian/dirs0000644000000000000000000000004411652535227010060 0ustar
usr/bin
usr/share/doc/readseq/tests
debian/README.test0000644000000000000000000000050312242421100011006 0ustar
You can test this package by

    cd `mktemp -d`
    cp -a /usr/share/doc/readseq/tests/* .
    make test

Note: Not all tests are known to pass, and it is not known
whether the tests themselves are correctly implemented.

Feel free to use `reportbug readseq` if you have an idea how to
fix the tests.
debian/changelog0000644000000000000000000001347712260574561011064 0ustar
readseq (1-11) unstable; urgency=medium

  * debian/patches/buffer_overflow.patch: Fix buffer overflow
    (thanks to Michael Bienia for the patch)
    Closes: #733650

 -- Andreas Tille Tue, 31 Dec 2013 15:38:41 +0100

readseq (1-10) unstable; urgency=low

  * debian/upstream: citation information
  * debian/control:
     - cme fix dpkg-control
     - debhelper 9
     - canonical Vcs URLs
  * debian/README.source: removed because redundant
  * debian/patches/enable_tests.patch: Fix test target
  * debian/patches/hardening.patch: Propagate hardening options
  * debian/copyright: DEP5
  * provide test after installing the package

 -- Andreas Tille Mon, 18 Nov 2013 14:22:50 +0100

readseq (1-9) unstable; urgency=low

  * Remove [Biology] from description
  * Standards-Version: 3.9.2 (no changes needed)
  * debian/source/format: 3.0 (quilt)
  * debian/get-orig-source: Make sure repackaging is binary reproducible
  * Fixed Vcs fields
  * Debhelper 8 (control+compat)
  * debian/rules: switch to short dh syntax
  * debian/patches/gcc-4.6_format-security.patch: Fix format-security
    issue
    Closes: #643465
  * debian/dirs: Create dirs used by dh_auto_install

 -- Andreas Tille Fri, 28 Oct 2011 16:15:21 +0200

readseq (1-8) unstable; urgency=low

  [ Charles Plessy ]
  * DM-Upload-Allowed: Yes

  [ Andreas Tille ]
  * Added Vcs-Browser, Vcs-Svn
  * Standards-Version: 3.8.3 (added README.source)
  * Debhelper 7
  * Switch to quilt
  *
    debian/patches/552830.patch: Patch from Ruben Molina to fix build
    problem (thanks to Ruben), changed the one remaining use of getline
    myself
    Closes: #552830

 -- Andreas Tille Thu, 26 Nov 2009 12:31:49 +0100

readseq (1-7) unstable; urgency=low

  * Group maintenance by Debian-Med packaging group
  * Build-Depends: debhelper (>= 5.0)
  * Standards-Version: 3.7.2 (no changes necessary)
  * Switched to dpatch, cdbs
  * Added get-orig-source target
  * Moved manpage to Debian directory
  * debian/control: Homepage tag

 -- Andreas Tille Sat, 17 Nov 2007 23:04:00 +0100

readseq (1-6) unstable; urgency=low

  * Fixed amd64 bug (segfaults when command line options given), thanks
    to a hint by the ncbi6 maintainer. Closes: #269643.

 -- Michael Schmitz Tue, 26 Oct 2004 18:12:15 +0100

readseq (1-5) unstable; urgency=low

  * Fixed bug in format detection that caused readseq to misidentify
    LinAll format files whose title begins with a number as Phylip
    format. Bug spotted locally and fixed by Gerhard Steger.

 -- Michael Schmitz Thu, 12 Jun 2003 17:55:15 +0100

readseq (1-4) unstable; urgency=low

  * New Maintainer (took over from Andreas Tille 'cuz readseq is
    actually in use at our site).
  * Added LinAll and Vienna RNA formats.

 -- Michael Schmitz Thu, 07 Nov 2002 16:55:05 +0100

readseq (1-3) unstable; urgency=low

  * Corrected spelling bug (Thanks to Matt Zimmerman)
    closes: #125305

 -- Andreas Tille Tue, 18 Dec 2001 12:03:25 +0100

readseq (1-2) unstable; urgency=low

  * compile explicitly with -lncbiid1 and -lnetcli according to a hint
    of ncbi-tools6 maintainer Aaron M.
    Ucko (Thanks Aaron)
    closes: #119586

 -- Andreas Tille Thu, 15 Nov 2001 17:01:18 +0100

readseq (1-1) unstable; urgency=low

  * New Maintainer
    closes: #100257
  * New upstream release
  * Applied Bugfix 20Apr93
  * Applied fixes from former versions
  * Added debian/translate.patch to the sources for those brave people
    who want to test it (please report if I should apply it to the
    binary)
  * Replaced a gets call by fgets to avoid buffer overflow
  * increased p.namewidth from 8 to 10 in ureadseq.h because ARB needs
    this
  * Added patch by o. strunk (ARB) to allow numbers in genbank sequences
    to ureadseq.c
  * Added an undocumented ARB patch which seems to fix a problem with
    older Phylip versions.
  * Added a further patch from ARB. Patches can be undone by just
    undefining ARB.
  * I did not apply the patch of the ARB version which wrapped isdigit
    by rs_isdigit in readseq.c.
  * Declared in README.Debian that this is the packaged version 1 of
    readseq for certain reasons. Because I really need this version I
    close the "new version available" bug. If somebody really wants the
    new version please file an RFP bug against wnpp. Make sure to read
    /usr/share/doc/readseq/README.Debian before.
    closes: #43372
  * Added URL to the package description because I consider this as
    "good style" to have an upstream link without installing the
    package.
  * Install Readseq.help as NCBI Vibrant Toolkit compatible helpfile
    instead of just putting it into /usr/share/doc/readseq.

 -- Andreas Tille Mon, 5 Nov 2001 08:43:12 +0100

readseq (0.0-4) unstable; urgency=low

  * Maintainer set to Debian QA Group.

 -- Adrian Bunk Fri, 24 Aug 2001 23:51:22 +0200

readseq (0.0-3) unstable; urgency=low

  * Adopted by new maintainer; closes: #92801
  * Updated to latest standards version and added Build-Depends;
    closes: #84541, #91036, #91640
  * Corrected doc path in Makefile and manpage.
  * Moved from section misc to science.

 -- Dr. Guenter Bechly Thu, 19 Apr 2001 20:05:45 +0200

readseq (0.0-2) unstable; urgency=low

  * Manual page.
    Closes #33772
  * Now linked with the NCBI libraries and therefore depends on
    ncbi-tools.

 -- Stephane Bortzmeyer Sat, 6 Mar 1999 22:44:04 +0100

readseq (0.0-1) unstable; urgency=low

  * Initial Release.

 -- Stephane Bortzmeyer Fri, 5 Feb 1999 16:58:28 +0100
debian/tests/0000755000000000000000000000000012242420610010321 5ustar
debian/tests/Makefile0000644000000000000000000000373012242417777012010 0ustar
#
# Makefile for testing readseq
# to use, command me:
# % make test
#
test:
	@echo ""
	@echo "Test for general read/write of all chars:"
	/usr/bin/readseq -p alphabet.std -otest.alpha
	-diff test.alpha alphabet.std
	@echo ""
	@echo "Test for valid format conversions:"
	/usr/bin/readseq -v -p -f=ig nucleic.std -otest.ig
	/usr/bin/readseq -v -p -f=gb test.ig -otest.gb
	/usr/bin/readseq -v -p -f=nbrf test.gb -otest.nbrf
	/usr/bin/readseq -v -p -f=embl test.nbrf -otest.embl
	/usr/bin/readseq -v -p -f=gcg test.embl -otest.gcg
	/usr/bin/readseq -v -p -f=strider test.gcg -otest.strider
	/usr/bin/readseq -v -p -f=fitch test.strider -otest.fitch
	/usr/bin/readseq -v -p -f=fasta test.fitch -otest.fasta
	/usr/bin/readseq -v -p -f=pir test.fasta -otest.pir
	/usr/bin/readseq -v -p -f=ig test.pir -otest.ig-b
	-diff test.ig test.ig-b
	@echo ""
	@echo "Test for multiple-sequence format conversions:"
	/usr/bin/readseq -p -f=ig multi.std -otest.m-ig
	/usr/bin/readseq -p -f=gb test.m-ig -otest.m-gb
	/usr/bin/readseq -p -f=nbrf test.m-gb -otest.m-nbrf
	/usr/bin/readseq -p -f=embl test.m-nbrf -otest.m-embl
	/usr/bin/readseq -p -f=fasta test.m-embl -otest.m-fasta
	/usr/bin/readseq -p -f=pir test.m-fasta -otest.m-pir
	/usr/bin/readseq -p -f=msf test.m-pir -otest.m-msf
	/usr/bin/readseq -p -f=paup test.m-msf -otest.m-paup
	/usr/bin/readseq -p -f=ig test.m-paup -otest.m-ig-b
	-diff test.m-ig test.m-ig-b
#
# if using NCBI, uncomment these lines
	@echo ""
	@echo "Test of NCBI ASN.1 conversions:"
	/usr/bin/readseq -p -f=asn test.m-ig -otest.m-asn
	/usr/bin/readseq -p -f=ig test.m-asn -otest.m-ig-c
	-diff test.m-ig test.m-ig-c
#
@echo "" @echo "Expect differences in the header lines due to" @echo "different format headers. If any sequence lines" @echo "differ, or if the checksums differ, there is a problem." @echo "----------------------" @echo "" @echo "To clean up test files, command me:" @echo " make clean" clean: rm -f test.* debian/translate.patch0000644000000000000000000000733111652531702012212 0ustar ## ## Patch to readseq to allow input sequence symbols to be translated ## to different output sequence symbols. This has not been tested, ## as of 22 Dec 94. D.Gilbert ## diff -bwrc old/Makefile rtrans/Makefile *** old/Makefile Thu Dec 22 08:58:41 1994 --- rtrans/Makefile Thu Dec 22 08:58:31 1994 *************** *** 13,21 **** CFLAGS= #CFLAGS= -DSMALLCHECKSUM # if you prefer to use a GCG-standard 13 bit checksum # instead of a full 32 bit checksum. This may enhance compatibility w/ GCG software SOURCES= readseq.c ureadseq.c ureadseq.h ureadasn.c ! DOCS= Readme Readseq.help Formats Stdfiles Makefile Make.com add.gdemenu *.std # NCBI toolkit support for ASN.1 reader --- 13,23 ---- CFLAGS= #CFLAGS= -DSMALLCHECKSUM # if you prefer to use a GCG-standard 13 bit checksum # instead of a full 32 bit checksum. This may enhance compatibility w/ GCG software + #CFLAGS= -DTRANSLATE # if you want option to translate certain input sequence symbols + # to a different output sequence symbols SOURCES= readseq.c ureadseq.c ureadseq.h ureadasn.c ! 
DOCS= Readme readseq.help Formats Stdfiles Makefile Make.com add.gdemenu *.std # NCBI toolkit support for ASN.1 reader diff -bwrc old/readseq.c rtrans/readseq.c *** old/readseq.c Thu Dec 22 08:58:41 1994 --- rtrans/readseq.c Thu Dec 22 09:01:05 1994 *************** *** 338,343 **** --- 338,347 ---- fprintf(stderr, " -o[utput=]out.seq redirect Output\n"); fprintf(stderr, " -p[ipe] Pipe (command line, stdout)\n"); fprintf(stderr, " -r[everse] change to Reverse-complement\n"); + #ifdef TRANSLATE + fprintf(stderr, " -t[ranslate=]io translate input symbol [i] to output symbol [o] \n"); + fprintf(stderr, " use several -tio to translate several symbols \n"); + #endif fprintf(stderr, " -v[erbose] Verbose progress\n"); fprintf(stderr, " -f[ormat=]# Format number for output, or\n"); fprintf(stderr, " -f[ormat=]Name Format name for output:\n"); *************** *** 474,479 **** --- 478,486 ---- foo = NULL; gPrettyInit(gPretty); + #ifdef TRANSLATE + gTranslateInit(); + #endif } *************** *** 590,595 **** --- 597,615 ---- outform = parseformat( sparam); return kOptOkay; } + + #ifdef TRANSLATE + if (checkopt( false, sopt, "-translate", 3)) {/* -translate=io */ + if (*sparam==0) { for (sparam= sopt+3; isalpha(*sparam); sparam++) ; } + if (*sparam) gTranslate[*sparam]= sparam[1]; + return kOptOkay; + } + if (checkopt( false, sopt, "-t", 2)) { /* shorthand is -tio */ + if (*sparam==0) sparam= sopt+2; + if (*sparam) gTranslate[*sparam]= sparam[1]; + return kOptOkay; + } + #endif if (checkopt( false, sopt, "-output", 3)) {/* -output=myseq */ if (*sparam==0) { for (sparam= sopt+3; isalpha(*sparam); sparam++) ; } diff -bwrc old/ureadseq.c rtrans/ureadseq.c *** old/ureadseq.c Thu Dec 22 08:58:41 1994 --- rtrans/ureadseq.c Thu Dec 22 09:01:43 1994 *************** *** 162,168 **** --- 162,172 ---- } else V->seq = ptr; } + #ifdef TRANSLATE + V->seq[(V->seqlen)++] = gTranslate[*s]; + #else V->seq[(V->seqlen)++] = *s; + #endif } s++; } diff -bwrc old/ureadseq.h rtrans/ureadseq.h *** 
old/ureadseq.h Thu Dec 22 08:58:41 1994 --- rtrans/ureadseq.h Thu Dec 22 08:58:28 1994 *************** *** 113,118 **** --- 113,126 ---- extern prettyopts gPretty; #endif + #ifdef TRANSLATE + #ifdef UREADSEQ_G + char gTranslate[256]; + #else + extern char gTranslate[256]; + #endif + #define gTranslateInit() { short c; for(c=0; c<256; c++) gTranslate[c]= c; } + #endif #ifdef __cplusplus extern "C" { debian/source/0000755000000000000000000000000011652533256010476 5ustar debian/source/format0000644000000000000000000000001411301174376011677 0ustar 3.0 (quilt) debian/links0000644000000000000000000000010211652531702010224 0ustar usr/share/doc/readseq/Readseq.help usr/share/readseq/readseq_help debian/compat0000644000000000000000000000000212242412365010365 0ustar 9 debian/install0000644000000000000000000000015312242420025010547 0ustar readseq usr/bin debian/tests/Makefile usr/share/doc/readseq/tests *.std usr/share/doc/readseq/tests debian/get-orig-source0000644000000000000000000000133511570477204012135 0ustar #!/bin/sh prog=readseq version=1 dir="$prog"-"$version" tardir=`pwd`/../tarballs/"$dir" # url="ftp://ftp.bio.indiana.edu/molbio/readseq/version1/src" url="http://iubio.bio.indiana.edu/soft/molbio/readseq/version1/src/" mkdir -p "$tardir" cd "$tardir" files="add.gdemenu alphabet.std Formats macinit.c macinit.r Make.com Makefile Make.ncbi multi.std nucleic.std Readme readseq.c Readseq.help readseqSIOW.make Stdfiles upper.std ureadasn.c ureadseq.c ureadseq.h" for file in $files ; do wget "$url/$file" done cd .. 
GZIP="--best --no-name" tar -czf "$prog"_"$version".orig.tar.gz "$dir" rm -rf "$dir" debian/patches/0000755000000000000000000000000012260574525010625 5ustar debian/patches/enable_tests.patch0000644000000000000000000000532512242414276014317 0ustar Author: Andreas Tille LastChanged: Mon, 18 Nov 2013 14:22:50 +0100 Description: Make sure the just builded readseq will be used for testing --- a/Makefile +++ b/Makefile @@ -53,41 +53,41 @@ build: $(SOURCES) test: @echo "" @echo "Test for general read/write of all chars:" - readseq -p alphabet.std -otest.alpha + ./readseq -p alphabet.std -otest.alpha -diff test.alpha alphabet.std @echo "" @echo "Test for valid format conversions:" - readseq -v -p -f=ig nucleic.std -otest.ig - readseq -v -p -f=gb test.ig -otest.gb - readseq -v -p -f=nbrf test.gb -otest.nbrf - readseq -v -p -f=embl test.nbrf -otest.embl - readseq -v -p -f=gcg test.embl -otest.gcg - readseq -v -p -f=strider test.gcg -otest.strider - readseq -v -p -f=fitch test.strider -otest.fitch - readseq -v -p -f=fasta test.fitch -otest.fasta - readseq -v -p -f=pir test.fasta -otest.pir - readseq -v -p -f=ig test.pir -otest.ig-b + ./readseq -v -p -f=ig nucleic.std -otest.ig + ./readseq -v -p -f=gb test.ig -otest.gb + ./readseq -v -p -f=nbrf test.gb -otest.nbrf + ./readseq -v -p -f=embl test.nbrf -otest.embl + ./readseq -v -p -f=gcg test.embl -otest.gcg + ./readseq -v -p -f=strider test.gcg -otest.strider + ./readseq -v -p -f=fitch test.strider -otest.fitch + ./readseq -v -p -f=fasta test.fitch -otest.fasta + ./readseq -v -p -f=pir test.fasta -otest.pir + ./readseq -v -p -f=ig test.pir -otest.ig-b -diff test.ig test.ig-b @echo "" @echo "Test for multiple-sequence format conversions:" - readseq -p -f=ig multi.std -otest.m-ig - readseq -p -f=gb test.m-ig -otest.m-gb - readseq -p -f=nbrf test.m-gb -otest.m-nbrf - readseq -p -f=embl test.m-nbrf -otest.m-embl - readseq -p -f=fasta test.m-embl -otest.m-fasta - readseq -p -f=pir test.m-fasta -otest.m-pir - readseq -p -f=msf 
test.m-pir -otest.m-msf - readseq -p -f=paup test.m-msf -otest.m-paup - readseq -p -f=ig test.m-paup -otest.m-ig-b + ./readseq -p -f=ig multi.std -otest.m-ig + ./readseq -p -f=gb test.m-ig -otest.m-gb + ./readseq -p -f=nbrf test.m-gb -otest.m-nbrf + ./readseq -p -f=embl test.m-nbrf -otest.m-embl + ./readseq -p -f=fasta test.m-embl -otest.m-fasta + ./readseq -p -f=pir test.m-fasta -otest.m-pir + ./readseq -p -f=msf test.m-pir -otest.m-msf + ./readseq -p -f=paup test.m-msf -otest.m-paup + ./readseq -p -f=ig test.m-paup -otest.m-ig-b -diff test.m-ig test.m-ig-b # # if using NCBI, uncomment these lines @echo "" @echo "Test of NCBI ASN.1 conversions:" - readseq -p -f=asn test.m-ig -otest.m-asn - readseq -p -f=ig test.m-asn -otest.m-ig-c + ./readseq -p -f=asn test.m-ig -otest.m-asn + ./readseq -p -f=ig test.m-asn -otest.m-ig-c -diff test.m-ig test.m-ig-c # @echo "" debian/patches/30-arb-code-patches.patch0000644000000000000000000004022211652531702015161 0ustar Author: Andreas Tille Description: Several patches for arb that enable new formats and contain other needed fixes diff -ubrN readseq-1.orig/readseq.c readseq-1/readseq.c --- readseq-1.orig/readseq.c 1993-02-01 01:00:00.000000000 +0100 +++ readseq-1/readseq.c 2007-11-14 12:14:36.000000000 +0100 @@ -93,6 +93,10 @@ = fix bug for possible memory overrun when truncating seqs for Phylip or Paup formats (thanks Anthony Persechini) + 13 Sep 96 GSt + RL (Steger@biophys.uni-duesseldorf.de) + * real time in MSF format (Main); #include + + added VIE multi sequence file format + + added LinAll sequence file format */ @@ -169,8 +173,11 @@ #include +#include /* MSch */ #include #include +#include /* RL */ +#include #include "ureadseq.h" @@ -199,9 +206,11 @@ "16. ASN.1", "17. PAUP/NEXUS", "18. Pretty (out-only)", + "19. LinAll", + "20. 
Vienna", "" }; -#define kFormCount 30 +#define kFormCount 32 #define kMaxFormName 15 const struct formatTable { @@ -238,6 +247,8 @@ {"paup", kPAUP}, {"nexus", kPAUP}, {"pretty", kPretty}, + {"linall", kLINALL}, + {"vie", kVIE}, }; const char *kASN1headline = "Bioseq-set ::= {\nseq-set {\n"; @@ -415,7 +426,7 @@ fprintf( stderr, " %-20s %-20s\n", formats[i], formats[midi+i]); fprintf(stderr,"\nChoose an output format (name or #): \n"); - gets(sform); + fgets(sform, 127, stdin); outform = parseformat(sform); if (outform == kNoformat) outform = kPearson; return outform; @@ -708,8 +719,12 @@ #else #define Exit(a) exit(a) +#ifdef NCBI +Nlm_Int2 Nlm_Main(void) +#else main( int argc, char *argv[]) #endif +#endif { boolean closein = false; short ifile, nseq, atseq, format, err = 0, seqtype = kDNA, @@ -721,6 +736,14 @@ char stempstore[256], *stemp = stempstore; FILE *ftmp, *fin, *fout; long outindexmax= 0, noutindex= 0, *outindex = NULL; +time_t time_val; /* GSt + RL */ +size_t size_timestr = 50;/* GSt + RL */ +char timestr[50]; /* GSt + RL */ + +#ifdef NCBI +int argc; +char** argv; +#endif #define exit_main(err) { \ if (closeout) fclose(fout); \ @@ -739,6 +762,10 @@ resetGlobals(); +#if NCBI + argc = Nlm_GetArgc(); + argv = Nlm_GetArgv(); +#endif foo = stdout; progname = argv[0]; *oname = 0; @@ -764,7 +791,7 @@ quietly = (dopipe || (gotinputfile && (listonly || whichSeq != 0))); - if (verbose || (!quietly && !gotinputfile)) fprintf( stderr, title); + //if (verbose || (!quietly && !gotinputfile)) fprintf( stderr, "%s\n", title); ifile = 1; /* UI: Choose output */ @@ -1003,6 +1030,13 @@ else if (dolower) for (i = 0; i +#define __NO_CTYPE +#include /* MSch */ #include #include #define UREADSEQ_G #include "ureadseq.h" +/* changed according to original which is the same with the changed header (at) */ #pragma segment ureadseq @@ -66,7 +69,7 @@ # define Local static /* local functions */ #endif -#define kStartLength 500 +#define kStartLength 500000 /* 20Apr93 temp. 
bug fix */ const char *aminos = "ABCDEFGHIKLMNPQRSTVWXYZ*"; const char *primenuc = "ACGTU"; @@ -101,6 +104,9 @@ long linestart; char s[256], *sp; +#ifdef ARB + int (*isseqcharfirst8)(); /* Patch by o. strunk (ARB) to allow numbers in genbank sequences*/ +#endif int (*isseqchar)(); /* int (*isseqchar)(int c); << sgi cc hates (int c) */ }; @@ -150,9 +156,23 @@ Local void addseq(char *s, struct ReadSeqVars *V) { char *ptr; +#ifdef ARB + /* Patch by o. strunk (ARB) to allow numbers in genbank sequences */ + int count = 0; +#endif +#ifdef ARB + if (V->addit){ + for (;*s != 0;s++,count++) { + if (count < 9 && V->isseqcharfirst8) { + if (!(V->isseqcharfirst8) (*s)) continue; + }else{ + if (!(V->isseqchar) (*s)) continue; + } +#else if (V->addit) while (*s != 0) { if ((V->isseqchar)(*s)) { +#endif if (V->seqlen >= V->maxseq) { V->maxseq += kStartLength; ptr = (char*) realloc(V->seq, V->maxseq+1); @@ -164,7 +184,9 @@ } V->seq[(V->seqlen)++] = *s; } +#ifndef ARB s++; +#endif } } @@ -324,6 +346,11 @@ Local void readGenBank(struct ReadSeqVars *V) { /*GenBank -- many seqs/file */ +#ifdef ARB + /* Patch by o. strunk (ARB) to allow numbers in genbank sequences */ + V->isseqchar = isSeqNumChar; + V->isseqcharfirst8 = isSeqChar; +#endif while (!V->allDone) { strcpy(V->seqid, (V->s)+12); while (! 
(feof(V->f) || strstr(V->s,"ORIGIN") == V->s)) @@ -337,9 +364,44 @@ } if (feof(V->f)) V->allDone = true; } +#ifdef ARB + V->isseqchar = isSeqChar; + V->isseqcharfirst8 = 0; +#endif } +Local boolean endVIE( boolean *addend, boolean *ungetend, struct ReadSeqVars *V) /* GSt + RL */ +{ + if (*V->s == '>') { /* start of next seq */ + *addend = false; + *ungetend= true; + return(true); + } + else + return(false); +} + + +Local void readVIE(struct ReadSeqVars *V) /* GSt + RL */ +{ + while (!V->allDone) { + strcpy(V->seqid, (V->s)+2); + readLoop(0, false, endVIE, V); + if (feof(V->f)) V->allDone = true; + } +/* + printf("readVIE: V->nseq = %d\n",V->nseq); + printf("readVIE: V->choice = %d\n",V->choice); + printf("readVIE: V->addit = %d\n",V->addit); + printf("readVIE: V->seqlen = %ld\n",V->seqlen); + printf("readVIE: V->seqid = %s\n",V->seqid); + printf("readVIE: V->s = %s\n",V->s); + printf("readVIE: V->seqid = %s\n",V->seqid); + printf("readVIE: V->s = %s\n<<s); +*/ +} + Local boolean endNBRF( boolean *addend, boolean *ungetend, struct ReadSeqVars *V) { char *a; @@ -449,6 +511,46 @@ } } +Local void readLINALL(struct ReadSeqVars *V) /* GSt */ +{ + /* SeqLen[I4] Label[Char*60] + Seq[Char*70 per line] + */ + int laenge; + int i; + + V->nseq++; /* but there is only a single sequence ? */ + // dprintf(("readLINALL: V->nseq = %d\n",V->nseq)); + /* V->addit = (V->choice > 0); */ /* what's that for ???? */ + // dprintf(("readLINALL: V->choice = %d\n",V->choice)); + // dprintf(("readLINALL: V->addit = %d\n",V->addit)); + // dprintf(("readLINALL: V->seqid = %s\n",V->seqid)); + // dprintf(("readLINALL: V->s = %s\n",V->s)); + /* if (V->addit) V->seqlen = 0; */ /* what's that for ???? 
*/ + sscanf(V->s, "%4d", &laenge); /* seqlen is in 1st 4 chars of 1st line */ + // dprintf(("readLINALL: laenge = %d\n",laenge)); + // fflush(stdout); + strcpy(V->seqid, (V->s)+5); /* label starts after 5th char of 1st line */ + // dprintf(("readLINALL: V->seqid = %s\n",V->seqid)); + // fflush(stdout); + do { + V->done = feof(V->f); + getline(V); + if (!V->done) addseq((V->s), V); + } while ( !(V->done) && (V->seqlen)<=laenge ); + V->seqlen = laenge; /* only laenge chars are relevant for V->seq */ + // dprintf(("readLINALL: V->s = %s\n",V->s)); + /* if (V->choice == kListSequences) addinfo(V->seqid, V); */ /* what's that for ???? */ + // dprintf(("readLINALL: V->seqid = %s\n",V->seqid)); + // dprintf(("readLINALL: V->seqlen = %ld\n",V->seqlen)); + // dprintf(("readLINALL: V->seq =>")); + // for ( i=0; iseqlen; i++ ) dprintf(("%c",V->seq[i])); + // dprintf(("<\n")); + // dprintf(("readLINALL: V->s = %s\n<<s)); + V->allDone = true; + +} + Local boolean endFitch( boolean *addend, boolean *ungetend, struct ReadSeqVars *V) @@ -956,6 +1058,8 @@ case kZuker : readZuker(V); break; case kOlsen : readOlsen(V); break; case kMSF : readMSF(V); break; + case kLINALL: readLINALL(V); break; + case kVIE : readVIE(V); break; case kPAUP : { boolean done= false; @@ -1049,6 +1153,9 @@ V.err = 0; V.nseq = 0; V.isseqchar = isSeqChar; +#ifdef ARB + V.isseqcharfirst8 = 0; +#endif if (V.choice == kListSequences) ; /* leave as is */ else if (V.choice <= 0) V.choice = 1; /* default ?? */ V.addit = (V.choice > 0); @@ -1092,6 +1199,9 @@ V.err = 0; V.nseq = 0; V.isseqchar = isSeqChar; +#ifdef ARB + V.isseqcharfirst8 = 0; +#endif if (V.choice == kListSequences) ; /* leave as is */ else if (V.choice <= 0) V.choice = 1; /* default ?? 
*/ V.addit = (V.choice > 0); @@ -1152,6 +1262,7 @@ boolean foundDNA= false, foundIG= false, foundStrider= false, foundGB= false, foundPIR= false, foundEMBL= false, foundNBRF= false, foundPearson= false, foundFitch= false, foundPhylip= false, foundZuker= false, + foundLINALL= false, foundVIE= false, gotolsen= false, gotpaup = false, gotasn1 = false, gotuw= false, gotMSF= false, isfitch= false, isphylip= false, done= false; short format= kUnknown; @@ -1159,6 +1270,8 @@ char sp[256]; long linestart=0; int maxlines2check=500; + int linallSeqLen; + char linallHeader[60]; #define ReadOneLine(sp) \ { done |= (feof(fseq)); \ @@ -1225,8 +1338,9 @@ foundPIR= true; else if (*sp == '>') { - if (sp[3] == ';') foundNBRF= true; - else foundPearson= true; + if (sp[3] == ';') foundNBRF= true; /* {foundNBRF= true; printf("foundNBRF\n");} */ + else if (sp[1] == ' ') foundVIE= true; /* {foundVIE= true; printf("foundVIE\n");} */ + else foundPearson= true; /* {foundPearson= true; printf("foundPearson\n");} */ } else if (strstr(sp,"ID ") == sp) @@ -1239,9 +1353,16 @@ else { if (nlines - *skiplines == 1) { - int ispp= 0, ilen= 0; - sscanf( sp, "%d%d", &ispp, &ilen); - if (ispp > 0 && ilen > 0) isphylip= true; + int ispp= 0, ilen= 0, icnt=0; + char junkstr[120]; + memset(junkstr,0,120); + icnt= sscanf( sp, "%d%d%c", &ispp, &ilen, junkstr); + if (icnt == 2 && ispp > 0 && ilen > 0) { + isphylip= true; + } + else if (icnt==3 && ispp > 0 && ilen > 0 && *junkstr == ' ') { + isphylip= true; + } } else if (isphylip && nlines - *skiplines == 2) { int tseq; @@ -1257,6 +1378,65 @@ } if (isfitch & (splen > 20)) foundFitch= true; +#ifdef DEBUG_LINALL + dprintf(("Check for LINALL\n")); + dprintf(("\tstrtol(sp,NULL,0) = %d\n",strtol(sp,NULL,0))); + dprintf(("\tnlines = %d\n",nlines)); + dprintf(("\tisphylip = %d\n",isphylip)); + dprintf(("\tfoundPhylip = %d\n",foundPhylip)); + dprintf(("\tisfitch = %d\n",isfitch)); + dprintf(("\tfoundFitch = %d\n",foundFitch)); +#endif + + /* + * This format detection 
was highly bogus ... + * Lesson 1: always initialize variables (in case the conversion fails...) + * Lesson 2: strings are passed to sscanf _without_ & (string variable is a pointer already) + * Lesson 3: forget to check return codes from syscalls: you lose. + */ + + if (nlines==1) { + int rv; +#ifdef DEBUG_LINALL + int i, sane=1; + char *spp; + + /* + * possible sanity check, for losers (see above) + */ + for (spp=sp, i=0; i<4; i++, spp++) + if (!(isspace(*spp) || isdigit(*spp))) { + dprintf(("bogus linall format header: %s\n", sp)); + sane=0; + break; + } +#endif + + linallSeqLen = 0; + *linallHeader = '\0'; + rv = sscanf( sp, "%d %s", &linallSeqLen, linallHeader); + +#ifdef DEBUG_LINALL + dprintf(("\tsscanf rval = %d\n",rv)); + dprintf(("\tlinallSeqLen = %d\n",linallSeqLen)); + dprintf(("\tlinallHeader = %s\n",linallHeader)); + dprintf(("\tlinallHeader = %d\n",strlen(linallHeader))); +#endif + + if (rv > 0 && linallSeqLen>0 && + strlen(linallHeader)>0 && + !(isphylip || foundPhylip)) { + /* !(isphylip || foundPhylip || isfitch || foundFitch)) { */ + foundLINALL= true; /* The 1st line contains the seqlength (4 digits), a blank, and a label (up to 60 char). */ + /* The following lines contain the sequence with 70 chars per line. 
*/ +#ifdef DEBUG_LINALL + dprintf(("debug: foundLINALL: %ld<\n",strtol(sp,NULL,0))); + dprintf(("debug: sp:%s<\n\n",sp)); +#endif + } + + } + /* kRNA && kDNA are fairly certain...*/ switch (getseqtype( sp, splen)) { case kOtherSeq: otherlines++; break; @@ -1302,6 +1482,16 @@ done= true; } + else if (foundLINALL) { + format= kLINALL; + done= true; + } + + else if (foundVIE) { + format= kVIE; + done= true; + } + else if ((dnalines > 1) || done || (nlines > maxlines2check)) { /* decide on most likely format */ /* multichar idents: */ @@ -1785,6 +1975,27 @@ linesout += 2; break; + case kLINALL: /* GSt */ + fprintf(outf,"%4d %-60s\n",seqlen,seqname); + strcpy(endstr,"\n"); + linesout++; + width = 70; + tab = 0; + spacer = 0; + nameleft = false; + nameright = false; + numleft = false; + numright = false; + break; + + case kVIE: /* GSt + RL */ + if ( strchr(seqname,' ') != NULL ) seqname[strchr(seqname,' ')-seqname] = '\0'; /* no blanks in label line */ + fprintf(outf,"> %-s\n", seqname); + linesout++; + fprintf(outf,"%s\n\n",seq); /* complete sequence in one line; additional blank line before next sequence */ + return linesout; /* thus, do nothing else */ + break; + default : case kZuker: /* don't attempt Zuker's ftn format */ case kPearson: @@ -1841,7 +2052,8 @@ s[l++] = ' '; if (!baseonlynum) ibase++; else if (0==strchr(nocountsymbols,seq[i])) ibase++; - s[l++] = seq[i++]; + if (outform==kLINALL) { s[l++] = to_upper(seq[i]); i++; } /* GSt */ + else s[l++] = seq[i++] ; } if (l1 == width || i == seqlen) { diff -ubrN readseq-1.orig/ureadseq.h readseq-1/ureadseq.h --- readseq-1.orig/ureadseq.h 1992-12-30 01:00:00.000000000 +0100 +++ readseq-1/ureadseq.h 2007-11-14 12:14:36.000000000 +0100 @@ -66,8 +66,10 @@ #define kASN1 16 #define kPAUP 17 #define kPretty 18 +#define kLINALL 19 +#define kVIE 20 -#define kMaxFormat 18 +#define kMaxFormat 20 #define kMinFormat 1 #define kNoformat -1 /* format not tested */ #define kUnknown 0 /* format not determinable */ @@ -100,7 +102,7 @@ 
p.noleaves= p.domatch= p.degap= false;\ p.matchchar='.';\ p.gapchar='-';\ - p.namewidth=8;\ + p.namewidth=10;\ p.numwidth=5;\ p.interline=1;\ p.spacer=10;\ debian/patches/series0000644000000000000000000000024112260553540012031 0ustar 20-Formats.patch 20-Makefile.patch 30-arb-code-patches.patch 552830.patch gcc-4.6_format-security.patch enable_tests.patch hardening.patch buffer_overflow.patch debian/patches/20-Makefile.patch0000644000000000000000000001111211652531702013570 0ustar Author: Andreas Tille Description: Enhanced Makefile --- readseq-1/Makefile.orig 1992-12-30 01:00:00.000000000 +0100 +++ readseq-1/Makefile 2007-11-17 21:31:39.000000000 +0100 @@ -10,7 +10,7 @@ #CC=cc # SGI Irix #CC=vcc # some DEC Ultrix -CFLAGS= +CFLAGS= -g -O2 #CFLAGS= -DSMALLCHECKSUM # if you prefer to use a GCG-standard 13 bit checksum # instead of a full 32 bit checksum. This may enhance compatibility w/ GCG software @@ -29,63 +29,66 @@ LIB2=-lncbiobj LIB3=-lncbicdr LIB4=-lvibrant -INCPATH=$(NCBI)/include -LIBPATH=$(NCBI)/lib +LIB5=-lncbimmdb -lncbiid1 -lnetcli +LIB6=-lncbiacc +LIB7=-lncbitool +INCPATH=/usr/include/ncbi +#LIBPATH=$(NCBI)/lib NCFLAGS=$(CFLAGS) -DNCBI -I$(INCPATH) -NLDFLAGS=-I$(INCPATH) -L$(LIBPATH) -NLIBS=$(LIB1) $(LIB2) $(OTHERLIBS) +NLDFLAGS=-I$(INCPATH) +NLIBS=$(LIB1) $(LIB2) $(LIB3) $(LIB6) $(LIB7) $(LIB5) $(OTHERLIBS) +ARBFLAGS=-DARB +all: build -all: build test - -build: $(SOURCES) - @echo "Compiling readseq..." - $(CC) $(CFLAGS) -o readseq readseq.c ureadseq.c +#build: $(SOURCES) +# @echo "Compiling readseq..." 
+# $(CC) $(CFLAGS) -o readseq readseq.c ureadseq.c # if using NCBI, uncomment these lines in place of build: above -#build: $(SOURCES) -# @echo "Compiling readseq with NCBI toolkit support..."; -# $(CC) -o readseq $(NLDFLAGS) $(NCFLAGS) readseq.c ureadseq.c ureadasn.c $(NLIBS) +build: $(SOURCES) + @echo "Compiling readseq with NCBI toolkit support and ARB patches"; + $(CC) -o readseq $(NLDFLAGS) $(NCFLAGS) $(ARBFLAGS) readseq.c ureadseq.c ureadasn.c $(NLIBS) -test: $(SOURCES) readseq +test: @echo "" @echo "Test for general read/write of all chars:" - ./readseq -p alphabet.std -otest.alpha + readseq -p alphabet.std -otest.alpha -diff test.alpha alphabet.std @echo "" @echo "Test for valid format conversions:" - ./readseq -v -p -f=ig nucleic.std -otest.ig - ./readseq -v -p -f=gb test.ig -otest.gb - ./readseq -v -p -f=nbrf test.gb -otest.nbrf - ./readseq -v -p -f=embl test.nbrf -otest.embl - ./readseq -v -p -f=gcg test.embl -otest.gcg - ./readseq -v -p -f=strider test.gcg -otest.strider - ./readseq -v -p -f=fitch test.strider -otest.fitch - ./readseq -v -p -f=fasta test.fitch -otest.fasta - ./readseq -v -p -f=pir test.fasta -otest.pir - ./readseq -v -p -f=ig test.pir -otest.ig-b + readseq -v -p -f=ig nucleic.std -otest.ig + readseq -v -p -f=gb test.ig -otest.gb + readseq -v -p -f=nbrf test.gb -otest.nbrf + readseq -v -p -f=embl test.nbrf -otest.embl + readseq -v -p -f=gcg test.embl -otest.gcg + readseq -v -p -f=strider test.gcg -otest.strider + readseq -v -p -f=fitch test.strider -otest.fitch + readseq -v -p -f=fasta test.fitch -otest.fasta + readseq -v -p -f=pir test.fasta -otest.pir + readseq -v -p -f=ig test.pir -otest.ig-b -diff test.ig test.ig-b @echo "" @echo "Test for multiple-sequence format conversions:" - ./readseq -p -f=ig multi.std -otest.m-ig - ./readseq -p -f=gb test.m-ig -otest.m-gb - ./readseq -p -f=nbrf test.m-gb -otest.m-nbrf - ./readseq -p -f=embl test.m-nbrf -otest.m-embl - ./readseq -p -f=fasta test.m-embl -otest.m-fasta - ./readseq -p -f=pir 
test.m-fasta -otest.m-pir - ./readseq -p -f=msf test.m-pir -otest.m-msf - ./readseq -p -f=paup test.m-msf -otest.m-paup - ./readseq -p -f=ig test.m-paup -otest.m-ig-b + readseq -p -f=ig multi.std -otest.m-ig + readseq -p -f=gb test.m-ig -otest.m-gb + readseq -p -f=nbrf test.m-gb -otest.m-nbrf + readseq -p -f=embl test.m-nbrf -otest.m-embl + readseq -p -f=fasta test.m-embl -otest.m-fasta + readseq -p -f=pir test.m-fasta -otest.m-pir + readseq -p -f=msf test.m-pir -otest.m-msf + readseq -p -f=paup test.m-msf -otest.m-paup + readseq -p -f=ig test.m-paup -otest.m-ig-b -diff test.m-ig test.m-ig-b # # if using NCBI, uncomment these lines -# @echo "" -# @echo "Test of NCBI ASN.1 conversions:" -# ./readseq -p -f=asn test.m-ig -otest.m-asn -# ./readseq -p -f=ig test.m-asn -otest.m-ig-c -# -diff test.m-ig test.m-ig-c + @echo "" + @echo "Test of NCBI ASN.1 conversions:" + readseq -p -f=asn test.m-ig -otest.m-asn + readseq -p -f=ig test.m-asn -otest.m-ig-c + -diff test.m-ig test.m-ig-c # @echo "" @echo "Expect differences in the header lines due to" @@ -97,8 +100,13 @@ @echo " make clean" +install: + install readseq $(DESTDIR)/usr/bin + install *.std $(DESTDIR)/usr/share/doc/readseq/tests + install Makefile $(DESTDIR)/usr/share/doc/readseq/tests + clean: - rm -f *.o core test.* + rm -f *.o core test.* readseq shar: @echo "shell archiving files..." @@ -109,3 +117,7 @@ shar -v readseqd > readseq.shar rm -rf readseqd + + + + debian/patches/hardening.patch0000644000000000000000000000153412242416414013600 0ustar Author: Andreas Tille LastChanged: Mon, 18 Nov 2013 14:22:50 +0100 Description: Propagate hardening options --- a/Makefile +++ b/Makefile @@ -10,7 +10,7 @@ CC=gcc # Gnu C Compiler #CC=cc # SGI Irix #CC=vcc # some DEC Ultrix -CFLAGS= -g -O2 +CFLAGS+= -g -O2 #CFLAGS= -DSMALLCHECKSUM # if you prefer to use a GCG-standard 13 bit checksum # instead of a full 32 bit checksum. 
This may enhance compatibility w/ GCG software
@@ -48,7 +48,7 @@ all: build
 # if using NCBI, uncomment these lines in place of build: above
 build: $(SOURCES)
 	@echo "Compiling readseq with NCBI toolkit support and ARB patches";
-	$(CC) -o readseq $(NLDFLAGS) $(NCFLAGS) $(ARBFLAGS) readseq.c ureadseq.c ureadasn.c $(NLIBS)
+	$(CC) -o readseq $(NLDFLAGS) $(NCFLAGS) $(ARBFLAGS) readseq.c ureadseq.c ureadasn.c $(NLIBS) $(LDFLAGS)
 
 test:
 	@echo ""
debian/patches/gcc-4.6_format-security.patch0000644000000000000000000000145712242414011016112 0ustar
Author: Andreas Tille
Date: Fri, 28 Oct 2011 16:15:21 +0200
Closes: #643465
Description: When using -Werror=format-security with gcc-4.6 some function
 calls are throwing errors. This patch replaces [f]printf with [f]puts to
 fix the problem
--- a/readseq.c
+++ b/readseq.c
@@ -335,7 +335,7 @@ void usage()
 {
   short i, midi;
 
-  fprintf(stderr,title);
+  fputs(title, stderr);
   fprintf(stderr, "usage: readseq [-options] in.seq > out.seq\n");
   fprintf(stderr," options\n");
@@ -988,7 +988,7 @@ char** argv;
         if (seqout == 0) fprintf( foo,"\\\\\\\n");
         break;
       case kASN1:
-        if (seqout == 0) fprintf( foo, kASN1headline);
+        if (seqout == 0) fputs(kASN1headline, foo);
         break;
       case kPhylip:
debian/patches/20-Formats.patch0000644000000000000000000000153311652531702013474 0ustar
Author: Andreas Tille
Description: Add new Formats for ARB
--- readseq-1/Formats.orig 1992-12-30 01:00:00.000000000 +0100
+++ readseq-1/Formats 2007-11-14 12:14:36.000000000 +0100
@@ -978,3 +978,19 @@
 hist Seq-hist OPTIONAL } -- sequence history
 
 ------------------------------------------------
+
+||||||||||| LinAll sequence file format
+----------------------------------------
+
+1234 seq1-id (1234 is sequence length, right justified)
+abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzab
+cdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz
+(80 characters per line, one sequence per file)
+
+||||||||||| Vienna RNA sequence file format
+-------------------------------------------- + +> seq1-id +jsdhkasjlhsdlkjcbsd ... (one single line) +> seq2-id +odjgoirhggonavdskgj ... (one single line) debian/patches/552830.patch0000644000000000000000000001544611652531702012420 0ustar Author: Ruben Molina Description: Fixes error: conflicting types for 'getline' problem (#552830) --- readseq-1.orig/ureadseq.c +++ readseq-1/ureadseq.c @@ -142,7 +142,7 @@ } } -Local void getline(struct ReadSeqVars *V) +Local void mygetline(struct ReadSeqVars *V) { readline(V->f, V->s, &V->linestart); } @@ -237,7 +237,7 @@ if (addfirst) addseq(V->s, V); do { - getline(V); + mygetline(V); V->done = feof(V->f); V->done |= (*endTest)( &addend, &ungetend, V); if (V->addit && (addend || !V->done) && (strlen(V->s) > margin)) { @@ -268,7 +268,7 @@ while (!V->allDone) { do { - getline(V); + mygetline(V); for (si= V->s; *si != 0 && *si < ' '; si++) *si= ' '; /* drop controls */ if (*si == 0) *V->s= 0; /* chop line to empty */ } while (! (feof(V->f) || ((*V->s != 0) && (*V->s != ';') ) )); @@ -294,13 +294,13 @@ { /* ? only 1 seq/file ? */ while (!V->allDone) { - getline(V); + mygetline(V); if (strstr(V->s,"; DNA sequence ") == V->s) strcpy(V->seqid, (V->s)+16); else strcpy(V->seqid, (V->s)+1); while ((!feof(V->f)) && (*V->s == ';')) { - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; else readLoop(0, true, endStrider, V); @@ -320,16 +320,16 @@ while (!V->allDone) { while (! (feof(V->f) || strstr(V->s,"ENTRY") || strstr(V->s,"SEQUENCE")) ) - getline(V); + mygetline(V); strcpy(V->seqid, (V->s)+16); while (! (feof(V->f) || strstr(V->s,"SEQUENCE") == V->s)) - getline(V); + mygetline(V); readLoop(0, false, endPIR, V); if (!V->allDone) { while (! (feof(V->f) || ((*V->s != 0) && (strstr( V->s,"ENTRY") == V->s)))) - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; } @@ -354,13 +354,13 @@ while (!V->allDone) { strcpy(V->seqid, (V->s)+12); while (! 
(feof(V->f) || strstr(V->s,"ORIGIN") == V->s)) - getline(V); + mygetline(V); readLoop(0, false, endGB, V); if (!V->allDone) { while (! (feof(V->f) || ((*V->s != 0) && (strstr( V->s,"LOCUS") == V->s)))) - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; } @@ -426,11 +426,11 @@ { while (!V->allDone) { strcpy(V->seqid, (V->s)+4); - getline(V); /*skip title-junk line*/ + mygetline(V); /*skip title-junk line*/ readLoop(0, false, endNBRF, V); if (!V->allDone) { while (!(feof(V->f) || (*V->s != 0 && *V->s == '>'))) - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; } @@ -452,7 +452,7 @@ readLoop(0, false, endPearson, V); if (!V->allDone) { while (!(feof(V->f) || ((*V->s != 0) && (*V->s == '>')))) - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; } @@ -472,14 +472,14 @@ while (!V->allDone) { strcpy(V->seqid, (V->s)+5); do { - getline(V); + mygetline(V); } while (!(feof(V->f) | (strstr(V->s,"SQ ") == V->s))); readLoop(0, false, endEMBL, V); if (!V->allDone) { while (!(feof(V->f) | ((*V->s != '\0') & (strstr(V->s,"ID ") == V->s)))) - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; } @@ -499,13 +499,13 @@ /*! 
1st string is Zuker's Fortran format */ while (!V->allDone) { - getline(V); /*s == "seqLen seqid string..."*/ + mygetline(V); /*s == "seqLen seqid string..."*/ strcpy(V->seqid, (V->s)+6); readLoop(0, false, endZuker, V); if (!V->allDone) { while (!(feof(V->f) | ((*V->s != '\0') & (*V->s == '(')))) - getline(V); + mygetline(V); } if (feof(V->f)) V->allDone = true; } @@ -535,7 +535,7 @@ // fflush(stdout); do { V->done = feof(V->f); - getline(V); + mygetline(V); if (!V->done) addseq((V->s), V); } while ( !(V->done) && (V->seqlen)<=laenge ); V->seqlen = laenge; /* only laenge chars are relevant for V->seq */ @@ -588,7 +588,7 @@ do { addseq(V->s, V); V->done = feof(V->f); - getline(V); + mygetline(V); } while (!V->done); if (V->choice == kListSequences) addinfo(V->seqid, V); V->allDone = true; @@ -614,7 +614,7 @@ else if (si = strstr(V->seqid,"..")) *si = 0; do { V->done = feof(V->f); - getline(V); + mygetline(V); if (!V->done) addseq((V->s), V); } while (!V->done); if (V->choice == kListSequences) addinfo(V->seqid, V); @@ -633,7 +633,7 @@ if (V->addit) V->seqlen = 0; rewind(V->f); V->nseq= 0; do { - getline(V); + mygetline(V); V->done = feof(V->f); if (V->done && !(*V->s)) break; @@ -716,7 +716,7 @@ if (V->addit) V->seqlen = 0; rewind(V->f); V->nseq= 0; do { - getline(V); + mygetline(V); V->done = feof(V->f); if (V->done && !(*V->s)) break; @@ -787,7 +787,7 @@ domatch= (V->matchchar > 0); do { - getline(V); + mygetline(V); V->done = feof(V->f); if (V->done && !(*V->s)) break; @@ -868,7 +868,7 @@ /* rewind(V->f); V->nseq= 0; << do in caller !*/ indata= true; /* call here after we find "matrix" */ do { - getline(V); + mygetline(V); V->done = feof(V->f); if (V->done && !(*V->s)) break; @@ -953,7 +953,7 @@ /* fprintf(stderr,"Phylip-ileaf: topnseq=%d topseqlen=%d\n",V->topnseq, V->topseqlen); */ do { - getline(V); + mygetline(V); V->done = feof(V->f); if (V->done && !(*V->s)) break; @@ -1006,7 +1006,7 @@ while (isdigit(*si)) si++; skipwhitespace(si); V->topseqlen= atol(si); 
- getline(V); + mygetline(V); while (!V->allDone) { V->seqlencount= 0; strncpy(V->seqid, (V->s), 10); @@ -1037,10 +1037,10 @@ V->err = eFileNotFound; else { - for (l = skiplines_; l > 0; l--) getline( V); + for (l = skiplines_; l > 0; l--) mygetline( V); do { - getline( V); + mygetline( V); for (l= strlen(V->s); (l > 0) && (V->s[l] == ' '); l--) ; } while ((l == 0) && !feof(V->f)); @@ -1067,7 +1067,7 @@ char *cp; /* rewind(V->f); V->nseq= 0; ?? assume it is at top ?? skiplines ... */ while (!done) { - getline( V); + mygetline( V); tolowerstr( V->s); if (strstr( V->s, "matrix")) done= true; if (strstr( V->s, "interleav")) interleaved= true; @@ -1099,7 +1099,7 @@ break; case kFitch : - strcpy(V->seqid, V->s); getline(V); + strcpy(V->seqid, V->s); mygetline(V); readFitch(V); break; @@ -1107,7 +1107,7 @@ do { gotuw = (strstr(V->s,"..") != NULL); if (gotuw) readUWGCG(V); - getline(V); + mygetline(V); } while (!(feof(V->f) || V->allDone)); break; } debian/patches/buffer_overflow.patch0000644000000000000000000000102112260574525015034 0ustar Author: Michael Bienia Last-Update: 30 Dec 2013 18:34:52 +0100 Bug-Debian: http://bugs.debian.org/733650 Description: Fix buffer overflow in ureadseq.c --- readseq-1.orig/ureadseq.c +++ readseq-1/ureadseq.c @@ -1768,7 +1768,7 @@ short linesout = 0, seqtype = kNucleic; long i, j, l, l1, ibase; - char idword[31], endstr[10]; + char idword[31], endstr[14]; char seqnamestore[128], *seqname = seqnamestore; char s[kMaxseqwidth], *cp; char nameform[10], numform[10], nocountsymbols[10]; debian/rules0000755000000000000000000000027212242417220010243 0ustar #!/usr/bin/make -f # debian/rules for readseq # Andreas Tille , GPL %: dh $@ override_dh_clean: dh_clean rm -f readseq get-orig-source: . 
debian/get-orig-source
debian/README.debian0000644000000000000000000000374211303466617011304 0ustar
readseq for DEBIAN
------------------

This version is built against the NCBI libraries, but it seems quite buggy:
please report any problem with ASN.1 conversions.

I need this package to build Arb (http://www.arb-home.de).

There was a bug report (#43372) about a new version (version 2). I decided
to use the latest sources of version 1, which are in C, because I need exactly
this C version. I will not package version 2 for the following reasons:

  - Java adds an extra dependency on java-virtual-machine, which would
    bother my users and me
  - most certainly it would force readseq into contrib, which I want to
    avoid (not checked)
  - I really need the C version for Arb.

If somebody really wants to have readseq version 2 packaged (which might
well make sense), this should be done as a separate package, readseq2. In
this case please double check whether it can go into main.

The new version can be found at
  ftp://ftp.bio.indiana.edu/molbio/readseq/version2

Quote from the documentation of version 2:

  This release version 2, first available in 1999, continues support
  for the "classic" C version, in that it supports the same command-line
  options, but has extensions for sequence documentation, feature table
  and other additions, plus new sequence format conversions, and a lot
  of bug fixing. This java version is also more efficient, working
  faster than the compiled C classic version. It still isn't efficient
  enough to handle large sequences (genome sized or full GenBank/EMBL
  data release files).

Andreas Tille Thu, 25 Oct 2001 09:42:26 +0200

Starting with readseq 1-4, support for two more sequence formats was added:

  LinAll format: used by the 'LinAll' and 'ConStruct' RNA structure
  packages from the Duesseldorf biophysics group.

  Vienna RNA format: used by the Vienna RNA package available from the
  Theoretical Chemistry group at Vienna University.

Michael Schmitz Wed, 11 Dec 2002 13:38:05 +0200
debian/control0000644000000000000000000000151512242412430010565 0ustar
Source: readseq
Maintainer: Debian Med Packaging Team
Uploaders: Andreas Tille
Section: science
Priority: optional
Build-Depends: debhelper (>= 9), ncbi-tools-dev
Standards-Version: 3.9.4
Vcs-Browser: http://anonscm.debian.org/viewvc/debian-med/trunk/packages/readseq/trunk/
Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/readseq/trunk/
Homepage: http://iubio.bio.indiana.edu/soft/molbio/readseq/

Package: readseq
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Description: Conversion between sequence formats
 Reads and writes nucleic/protein sequences in various formats. Data files
 may have multiple sequences. Readseq is particularly useful as it
 automatically detects many sequence formats, and converts between them.
debian/docs0000644000000000000000000000007112242420114010027 0ustar
Formats
Readme
Stdfiles
Readseq.help
debian/README.test
debian/upstream0000644000000000000000000000062312214301056010744 0ustar
Reference:
  Author: Don Gilbert
  Title: Sequence file format conversion with command-line readseq
  Journal: Current Protocols in Bioinformatics
  Year: 2003
  Volume: Appendix 1
  Pages: E
  DOI: 10.1002/0471250953.bia01es00
  PMID: 18428689
  URL: http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bia01es00/abstract
  eprint: http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bia01es00/pdf