WriteWise - Artificial intelligence, Data mining, Deep learning, Natural language processing, Machine learning, Python, PyTorch, scikit-learn, Data Science, TensorFlow, Google Cloud


One of the products we are improving is a version of WriteWise that helps undergraduate and graduate students write thesis projects in Spanish. One feature in high demand is a Spanish grammar and spell checker.

To develop the Spanish grammar and spell checker, we need to obtain and/or generate a corpus of Spanish text with which to retrain pre-trained language models. The corpus should consist of academic texts (theses, research articles) and should include texts both with and without grammatical errors. This means investigating the Spanish datasets that are already available and, in parallel, generating our own.

Once the models have been trained, their performance (e.g., accuracy) in correcting advanced grammar and spelling errors in academic writing will have to be tested. In addition, basic spelling and grammar rules provided by linguists should be implemented in code.

The specific skills required are as follows:

- Indispensable: advanced handling of language models: 1) Embeddings from Language Models (ELMo); 2) Generative Pretrained Transformer (GPT); 3) Bidirectional Encoder Representations from Transformers (BERT). Experience with state-of-the-art language models such as GPT-3 and BLOOM is desirable.
- Desirable, but not indispensable: experience with NLP tasks in Spanish.
- Advanced Python programming (at least three years of experience).
- Advanced experience with any NLP library (e.g., NLTK, spaCy, and/or Stanford CoreNLP, among others) or machine learning library (e.g., scikit-learn, TensorFlow, and/or PyTorch, among others). Advanced knowledge of scikit-learn is desirable.
- Advanced experience in text mining and data analysis.
- Experience in the use of scrapers for corpus generation.
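As a rough illustration of the corpus-generation and testing steps described above, the following Python sketch builds (erroneous, correct) sentence pairs by stripping accent marks from clean Spanish text, and scores a corrector by sentence-level exact match. Everything in it is an assumption made for illustration: the function names `strip_accents`, `make_pairs`, and `exact_match_accuracy` are hypothetical, accent removal is only one of many error types a real corpus would need, and exact match is only one possible reading of "accuracy".

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop accent marks (á -> a, ü -> u) but keep ñ/Ñ.

    Assumption: missing accents stand in for one common Spanish
    spelling error; this is an illustrative noise function only.
    """
    out = []
    for ch in text:
        if ch in "ñÑ":
            # ñ is a distinct letter in Spanish, not an accented n.
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


def make_pairs(clean_sentences):
    """Return (noisy, clean) pairs for training or evaluation."""
    return [(strip_accents(s), s) for s in clean_sentences]


def exact_match_accuracy(predictions, references):
    """Fraction of predictions identical to their reference sentence."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical example sentences, not taken from any real corpus.
clean = [
    "La investigación analizó los datos.",
    "El método es válido.",
    "Los resultados son claros.",
]
pairs = make_pairs(clean)

# A "do nothing" corrector returns its input unchanged; it gives a
# baseline that any trained model should beat.
baseline_predictions = [noisy for noisy, _ in pairs]
baseline_accuracy = exact_match_accuracy(baseline_predictions, clean)
```

A trained model would replace the do-nothing baseline: its outputs for the noisy sentences would be passed to `exact_match_accuracy` together with the clean references.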

Years: Any

Location: Africa, Asia, Europe, Latin America, Middle East, Oceania, South Asia

Requested on: 2022-08-11
