A Simple Python Script to Convert a FASTA File to CSV Format

I searched online for a way to convert a FASTA file into CSV format for sequence visualization, like the output from GPCRdb.org, but I didn’t find what I wanted. So I wrote this quick-and-dirty script to convert FASTA to CSV.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import sys


if len(sys.argv) <= 1:
    print('\nPlease provide input FASTA file.\n')
    print('Usage:\npython fasta2csv.py input.fst output.csv\n')
    sys.exit(1)

if sys.argv[1] in ('-h', 'help', '-help'):
    print('\nUsage:\npython fasta2csv.py input.fst output.csv\n')
    sys.exit()

input_path = sys.argv[1]
if not os.path.exists(input_path):
    print('\nError: File "%s" does not exist!\n' % input_path)
    sys.exit(1)

# The output name is optional and defaults to output.csv
output_path = 'output.csv'
if len(sys.argv) > 2:
    output_path = sys.argv[2]

# Read in FASTA (a single sequence is expected)
seq_id = ''  # stays empty if the file has no header line
seq = ''
with open(input_path, 'r') as fasta_file:
    for line in fasta_file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            # FASTA header line
            seq_id = line
        else:
            # Sequence line
            seq += line

print('The input file is: %s' % input_path)


# Convert FASTA to CSV: one residue per cell, 60 residues per row
row = []
lines = [seq_id + '\n']
for i, c in enumerate(seq):
    row.append(c)
    if i % 60 == 59:
        lines.append(','.join(row) + '\n')
        row = []

# Keep the last, partially filled row
if row:
    lines.append(','.join(row) + '\n')


# Output CSV file
with open(output_path, 'w') as csv_file:
    csv_file.writelines(lines)
print('The output file is: %s' % output_path)

This script is rough but usable. It converts a FASTA file containing a single sequence into a CSV file in which each residue is a single cell, with 60 residues per row. It is very straightforward, so anyone can easily modify it for their own purposes.
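As one example of such a modification, here is a minimal sketch (Python 3) that handles a FASTA file with several sequences and lets you choose the row width. The function names `fasta_to_rows` and `rows_to_csv` and the `width` parameter are my own illustrative choices, not part of the script above. It also swaps the manual `','.join(...)` for the standard `csv` module, so a header that happens to contain a comma is quoted instead of being split into several cells.

```python
import csv
import io


def fasta_to_rows(fasta_text, width=60):
    """Return CSV rows for every record in fasta_text.

    Each record yields one row holding its header line, followed by rows
    of at most `width` single-residue cells. (Hypothetical helper.)
    """
    records = []  # list of [header, sequence] pairs
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            records.append([line, ''])  # start a new record
        elif records:
            records[-1][1] += line  # extend the current sequence

    rows = []
    for header, seq in records:
        rows.append([header])
        for start in range(0, len(seq), width):
            rows.append(list(seq[start:start + width]))
    return rows


def rows_to_csv(rows):
    """Serialize rows with the csv module so commas in headers are quoted."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator='\n').writerows(rows)
    return buffer.getvalue()


# Toy input with two records and a width of 4 to keep the output short
example = ">seq1 demo, toy\nMKTAY\nIAK\n>seq2\nGG\n"
print(rows_to_csv(fasta_to_rows(example, width=4)))
```

With the toy input, the first sequence is wrapped into two rows of four residues each, and its header is written as a single quoted cell because it contains a comma.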

Download fasta2csv