Research


Named Entity Recognition (NER), Information Extraction (IE) and Short Text Mining.

Projects

  • StereoJokes : A Comprehensive Annotation Framework for Humor Analysis;    August 2024 – Present
    • Developing an annotation framework to identify linguistic cues in English jokes for humor classification.

    • Producing annotated datasets for benchmarking fine-grained humor classification tasks in English.

  • Multipass NER : Reintroduction and Eviction in NER queues for Microblog Streams;    March – December 2023
    • Selective reintroduction of processed tweets into an iterative streaming pipeline, so that they are processed collectively with future incoming batches for high-confidence outputs.

    • Selective eviction of tweets from NER queues once the incoming evidence needed for successful NER stagnates.

  • NER Globalizer : Exploring Collective NER in Microblog Streams;    Aug 2021 – Dec 2022
    • Exploring Global Entity Embeddings for collective Entity Mention Detection, Disambiguation and Typing.

    • Tests architecturally diverse named entity taggers in a customizable pipeline with a special focus on Deep NER systems.

  • EMD Globalizer : Boosting Entity Mention Detection in Microblog Streams;    Jan 2019 – July 2021
    • A two-phase framework for continuous and iterative Entity Mention Detection (EMD) on voluminous message streams.

    • Enhances the effectiveness of existing EMD techniques by constructing Global Contextual Embeddings.

  • TwiCS : Lightweight Entity Mention Detection in Targeted Twitter Streams;    Mar 2017 – Dec 2018
    • A lightweight, iterative EMD system with an incremental learning framework for streaming environments.

    • Achieves a 2.64-fold improvement in throughput and a 15% increase in F1-score over state-of-the-art systems.


Publications