Daniil Ignatev

PhD Candidate in Natural Language Processing · Utrecht University · Utrecht, Netherlands · d.ignatev@uu.nl

Hi! I'm a computational linguist and NLP researcher with a background in linguistics and data science. I currently work as a PhD Candidate in Natural Language Processing at Utrecht University; my recent work is on annotator disagreement, natural language entailment, discourse and pragmatics.


Experience

PhD Candidate

Utrecht University · Natural Language Processing

Doctoral research in natural language processing. Recent public research output includes work on annotator disagreement and computational pragmatics.

Jan 2024 - Present
Faculty of Science · AI & Data Science

Data Scientist

Deeppavlov.ai

Worked on data annotation and training of classification algorithms for rhetorical structure prediction. Designed, implemented, and tested standalone modules for an open-source chatbot framework in Python.

Nov 2021 - Dec 2023
Remote · Moscow, Russia

Junior Researcher

Higher School of Economics

Collected dialectal linguistic field data. Developed backend and frontend web interfaces for linguistic corpora. Automated morphological and thematic annotation of natural language texts, increasing corpus annotation rates.

Sep 2019 - Dec 2022
On-site · Moscow, Russia

Developer Intern

Apertium

Implemented a set of finite-state machine translation tools for Bagvalal, a low-resource language from the Nakh-Daghestanian language family. Built a test suite to support a test-driven development workflow.

Jun 2021 - Aug 2021

Education

Higher School of Economics

Master of Science in Computational Linguistics
Natural language processing and fundamental linguistics
Sep 2020 - May 2022
Moscow, Russia

Higher School of Economics

Bachelor of Science in Linguistics
Major: literature studies, historical linguistics, digital humanities · Minor: machine learning with Python
Sep 2016 - May 2020
Moscow, Russia

Skills

Programming Languages
  • Python, C++, JavaScript, R, PHP, HTML, CSS
ML Frameworks & Libraries
  • PyTorch, Keras
  • Hugging Face, scikit-learn, CatBoost, Gensim
Databases & Tools
  • MySQL, PostgreSQL, Elasticsearch, MongoDB
  • Docker, Git, GitHub, GitHub Workflows

Projects

Pastandnow.ru

Flask, Pandas, MySQL, JavaScript · Website

Designed and developed an ETL pipeline for document indexing and search using Flask and MySQL. Designed a responsive interface to ensure device compatibility.

folkore.linghub.ru

Python, Flask, MySQL, Elasticsearch · Website

Improved and maintained an open-source web interface for a linguistic corpus of dialectal Russian. Upgraded document indexing routines using Elasticsearch.

north-folklore.ru

Python, Flask, MySQL, JavaScript · Website

Designed, deployed, and maintained an open-source web interface for a linguistic corpus of dialectal Russian.

DH Hackathon, HSE

Python, Gensim

Participated in the Digital Humanities Hackathon at Higher School of Economics and applied collocation analysis to diachronic corpora of the Russian language.

Unisearch

Python, Flask · Source Code

Designed and implemented an asynchronous search utility for indexing and searching texts.

ETL

Python · Source Code

Built an ETL pipeline for extracting information from Word documents.

Epigrafika.ru

PHP, Symfony, Bootstrap · Website

Maintained and updated the corpus website, improved general web interface functionality using Symfony, and designed and implemented a RESTful API for corpus export.

Old Russian Prosopography Database

Django, Bootstrap · Website

Implemented and maintained the corpus website.


Selected Publications

Human Label Variation in Implicit Discourse Relation Recognition

arXiv preprint · 2026

Links: arXiv

Don't Learn, Ground: A Case for Natural Language Inference with Visual Grounding

arXiv preprint · 2025

Links: arXiv

Disentangling the Roles of Representation and Selection in Data Pruning

ACL (Long Papers) · 2025

Links: ACL Anthology · arXiv

Hypernetworks for Perspectivist Adaptation

NLPerspectives (Perspectivist Approaches to NLP Workshop) · 2025

Links: ACL Anthology · arXiv

DeMeVa at LeWiDi-2025: Modeling Perspectives with In-Context Learning and Label Distribution Learning

NLPerspectives (LeWiDi 2025 shared task system paper) · 2025

Links: ACL Anthology PDF

Annotator disagreement in RST annotation schemes

Society for Computation in Linguistics · 2025

Links: ACL Anthology


Awards & Scholarships

Personal