opencv-python
pdf2image
pytesseract
mysql-connector-python
matplotlib
numpy
sentencepiece>=0.1.96
nltk
transformers
torch
torchvision
torchtext
langid
pillow
requests