Take all the (same-type) files in a folder and convert them. How do you iterate over them? You could load everything (file contents included) into a dictionary... Or it can be done more simply... Below are code fragments that use os.listdir(path) and iterate over the files in a folder.
!"C:\Program Files\Xpdf\bin64\pdftotext.exe"
!cmd.exe
import os
path = r'C:\abc\def\ghi' # remove the trailing '\'
data = {}
for dir_entry in os.listdir(path):
    dir_entry_path = os.path.join(path, dir_entry)
    if os.path.isfile(dir_entry_path):
        with open(dir_entry_path, 'r') as my_file:
            data[dir_entry] = my_file.read()
import os
path = r'C:\Users\kiss\Documents\Xpdf\aebru_2014_all' # remove the trailing '\'
data = {}
for dir_entry in os.listdir(path):
    dir_entry_path = os.path.join(path, dir_entry)
    if os.path.isfile(dir_entry_path):
        # PDF files are binary, so open them in 'rb' mode
        with open(dir_entry_path, 'rb') as my_file:
            data[dir_entry] = my_file.read()
os.listdir(path)
data['eng_car-sales-in-april-2014.pdf']
appGet("c:\windows\system32\cmd.exe")
!tree C:\Users\kiss\Documents\Xpdf /F
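The goal stated at the top (convert every file in a folder) can be sketched without loading file contents into a dictionary. This is a minimal sketch, not the method used in this post: the helper names build_commands and convert_all are hypothetical, and it assumes that pdftotext is reachable under the name passed as exe and accepts the input PDF followed by the output text file as positional arguments.

```python
import os
import subprocess

def build_commands(folder, exe="pdftotext"):
    """Return one [exe, pdf_path, txt_path] command per PDF file in folder.

    'exe' is an assumption: pass the full path to pdftotext.exe if it is
    not on PATH.
    """
    commands = []
    for name in sorted(os.listdir(folder)):
        src = os.path.join(folder, name)
        # Skip subdirectories and anything that is not a PDF
        if os.path.isfile(src) and name.lower().endswith(".pdf"):
            dst = os.path.splitext(src)[0] + ".txt"
            commands.append([exe, src, dst])
    return commands

def convert_all(folder, exe="pdftotext"):
    """Run the converter once per PDF; raises CalledProcessError on failure."""
    for cmd in build_commands(folder, exe):
        subprocess.run(cmd, check=True)
```

Calling convert_all(path) with the folder used above would then write a .txt file next to each PDF. Passing the command as a list means paths that contain spaces need no extra quoting.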
To iterate through all the files in a given directory (folder), with the ability to use wildcards (*, ?, and [ ]-style ranges), use the following code snippet:
import os
import glob
path = 'sequences/'
# glob.glob() returns paths that already include the 'sequences/' prefix
for infile in glob.glob(os.path.join(path, '*.fasta')):
    print("current file is: " + infile)
If you do not need wildcards, then there is a simpler way to list all items in a directory:
import os
path = 'sequences/'
# os.listdir() returns bare names (files and subdirectories), without the path
listing = os.listdir(path)
for infile in listing:
    print("current file is: " + infile)
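One difference between the two snippets is easy to miss: glob.glob() returns paths that include the directory prefix, while os.listdir() returns bare names and also lists subdirectories. A minimal sketch of a listing that yields full paths of regular files only, using os.scandir() (the helper name list_files is hypothetical):

```python
import os

def list_files(path):
    """Return the full paths of the regular files directly inside path."""
    files = []
    with os.scandir(path) as entries:
        for entry in entries:
            # entry.path is the directory path joined with the entry name
            if entry.is_file():
                files.append(entry.path)
    return sorted(files)
```

The returned paths can be passed straight to open() without another os.path.join() call.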
In Python 3, print is a function rather than a statement, so write print(infile) instead of print infile.
Use os.path.join() to keep the script portable across platforms: operating systems use different path separators, and a hard-coded separator may break the script on another OS.
The Python docs also describe glob.iglob(), which returns an iterator instead of a list. For directories with a very large number of files this saves memory, because it yields one result per iteration rather than building the whole list of files, as glob() does.
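A minimal sketch of the iglob() behaviour described above: take only the first few matches without building the full list. The helper name first_matches and the limit parameter are hypothetical, chosen for illustration.

```python
import glob
import itertools
import os

def first_matches(path, pattern="*.fasta", limit=3):
    """Return at most 'limit' paths in 'path' that match 'pattern'."""
    # iglob() yields matches one at a time instead of returning a list
    matches = glob.iglob(os.path.join(path, pattern))
    # islice() stops consuming the iterator after 'limit' items
    return list(itertools.islice(matches, limit))
```

The order of the results is whatever the operating system reports, so sort them if a stable order matters.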