Several ways to split a list into chunks
yield
def chunks1(input_list, n):
    # Yield successive slices of length n; the last one may be shorter
    for i in range(0, len(input_list), n):
        yield input_list[i:i + n]

input_list = [i for i in range(0, 15)]
print(list(chunks1(input_list, 4)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14]]
One-line for loop (list comprehension)
input_list = [i for i in range(0, 15)]
n = 3
output_list = [input_list[i:i + n] for i in range(0, len(input_list), n)]
print(output_list)
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13, 14]]
iterable
Works with any iterable, not just lists
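from itertools import islice

def chunks2(input_iter, n):
    # Turn the input into an iterator so islice resumes where the last chunk ended
    iterator = iter(input_iter)
    # iter(callable, sentinel) keeps calling the lambda until it returns ()
    return iter(lambda: tuple(islice(iterator, n)), ())

input_list = [i for i in range(0, 15)]
n = 4
print(list(chunks2(input_list, n)))
# [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]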
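To see what "any iterable" means in practice, here is a small sketch that feeds the same chunks2 helper a generator and a string, neither of which supports len() and slicing the way the list-based versions require. The sample inputs (squares and the string "abcdefg") are made up purely for illustration.

```python
from itertools import islice


def chunks2(input_iter, n):
    # Same helper as above: pull n items at a time until an empty tuple comes back
    iterator = iter(input_iter)
    return iter(lambda: tuple(islice(iterator, n)), ())


# A generator has no len() and cannot be sliced, but it can still be chunked
squares = (i * i for i in range(7))
print(list(chunks2(squares, 3)))
# [(0, 1, 4), (9, 16, 25), (36,)]

# A string is also an iterable; each chunk is a tuple of characters
print(list(chunks2("abcdefg", 3)))
# [('a', 'b', 'c'), ('d', 'e', 'f'), ('g',)]
```

Note that the chunks come back as tuples rather than lists, and a generator is consumed as it is chunked, so it cannot be chunked a second time.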
NumPy
import numpy as np
input_list = [i for i in range(0, 15)]
# The second argument is the number of chunks, not the size of each chunk
np.array_split(input_list, 5)
# [array([0, 1, 2]),
# array([3, 4, 5]),
# array([6, 7, 8]),
# array([ 9, 10, 11]),
# array([12, 13, 14])]
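Any of the simple approaches above gets the job done.
A few things to watch out for:
- What kind of input does each one target: only lists, or other input types as well?
- Every example uses n, but what does n actually control? It is the chunk size in the first three examples and the number of chunks in np.array_split.
- Handling of edge cases, such as an empty input, a length that is not divisible by n, or n <= 0.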
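The difference between the two meanings of n, and a couple of the edge cases, can be sketched in plain Python. The function names chunks_by_size and chunks_by_count are hypothetical, introduced here only for illustration. The second one distributes leftover elements to the first chunks, which is how np.array_split is documented to behave, but it is a hand-written approximation, not NumPy's implementation.

```python
def chunks_by_size(items, size):
    """Split items into lists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [items[i:i + size] for i in range(0, len(items), size)]


def chunks_by_count(items, count):
    """Split items into exactly `count` lists whose lengths differ by at most 1."""
    if count <= 0:
        raise ValueError("count must be a positive integer")
    base, extra = divmod(len(items), count)
    result, start = [], 0
    for k in range(count):
        # The first `extra` chunks take one additional element
        end = start + base + (1 if k < extra else 0)
        result.append(items[start:end])
        start = end
    return result


data = list(range(10))
print(chunks_by_size(data, 4))   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(chunks_by_count(data, 4))  # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
print(chunks_by_size([], 4))     # []
print(chunks_by_count([], 2))    # [[], []]
```

With the same n = 4, splitting by size yields three chunks while splitting by count yields four, so it is worth deciding up front which behaviour the caller expects.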
If the input file is too large and has to be processed in batches, try pandas' read_csv().
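As a rough sketch of batch processing with read_csv(): passing the chunksize parameter makes it return an iterator of DataFrames instead of loading everything at once. The CSV content, the column names id and value, and the chunk size of 4 are all assumptions made up for this example; an in-memory string stands in for a real large file.

```python
import io

import pandas as pd

# Stand-in for a large file on disk: an in-memory CSV with 10 data rows
csv_text = "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(10))

total = 0
chunk_sizes = []
# With chunksize set, read_csv yields one DataFrame per batch of rows
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    chunk_sizes.append(len(chunk))
    total += int(chunk["value"].sum())

print(chunk_sizes)  # [4, 4, 2]
print(total)        # 450
```

Only one batch is held in memory at a time, so aggregates such as the running total above can be computed without reading the whole file.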