Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

Categorizing online harassment on Twitter

Published in ECML PKDD, 2019

This paper is about harassment on social media and how to categorize it using machine learning and NLP techniques.

Recommended citation: Saeidi, Mozhgan and da S. Sousa, Samuel Bruno and Milios, Evangelos and Zeh, Norbert and Berton, Lilian. (2019). "Categorizing online harassment on Twitter." In Machine Learning and Knowledge Discovery in Databases: International Workshops of ECML PKDD 2019, Würzburg, Germany, September 16-20, 2019, Proceedings, Part II, pages 283-297. Springer, 2020. https://link.springer.com/chapter/10.1007/978-3-030-43887-6_22

Table of contents detection in financial documents

Published in COLING, 2020

This paper is about specific characteristics used in detecting document structure.

Recommended citation: Kosmajac, Dijana and Saeidi, Mozhgan and Taylor, Stacey. (2020). "Table of contents detection in financial documents." In Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation, pages 169-173. https://aclanthology.org/2020.fnp-1.29/

Graph representation learning in document wikification

Published in ICDAR, 2021

This paper is about the Wikification task and how to design a new embedding approach that improves the final Wikification results.

Recommended citation: Saeidi, Mozhgan and Milios, Evangelos and Zeh, Norbert. (2021). "Graph representation learning in document wikification." In Document Analysis and Recognition - ICDAR 2021 Workshops: Lausanne, Switzerland, September 5-10, 2021, Proceedings, Part II, pages 509-524. Springer. https://link.springer.com/chapter/10.1007/978-3-030-86159-9_37

Contextualized knowledge base sense embeddings in word sense disambiguation

Published in ICDAR, 2021

This paper is about the role of contextual embeddings in the word sense disambiguation task.

Recommended citation: Saeidi, Mozhgan and Milios, Evangelos and Zeh, Norbert. (2021). "Contextualized knowledge base sense embeddings in word sense disambiguation." In Document Analysis and Recognition - ICDAR 2021 Workshops: Lausanne, Switzerland, September 5-10, 2021, Proceedings, Part II, pages 174-186. Springer. https://link.springer.com/chapter/10.1007/978-3-030-86159-9_37

Graph Convolutional Networks for Categorizing Online Harassment on Twitter

Published in ICMLA, 2021

This paper is about using a graph convolutional network to categorize online harassment in Twitter posts.

Recommended citation: Saeidi, Mozhgan and Milios, Evangelos and Zeh, Norbert. (2021). "Graph Convolutional Networks for Categorizing Online Harassment on Twitter." In 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), pages 946-951. IEEE. https://ieeexplore.ieee.org/abstract/document/9680133

ContextBERT: Contextual Graph Representation Learning in Text Disambiguation

Published in PKDD, 2021

This paper is about solving the word sense disambiguation task using graph convolutional networks.

Recommended citation: Saeidi, Mozhgan. (2021). "ContextBERT: Contextual Graph Representation Learning in Text Disambiguation." In PKDD, Germany, September 16-20, 2019, Proceedings, Part II, pages 283-297. Springer, 2021. https://ceur-ws.org/Vol-2997/paper2.pdf

Biomedical Word Sense Disambiguation with Contextualized Representation Learning

Published in WWW, 2022

This paper is about using contextual representation learning when disambiguating biomedical text.

Recommended citation: Saeidi, Mozhgan and Milios, Evangelos and Zeh, Norbert. (2022). "Biomedical Word Sense Disambiguation with Contextualized Representation Learning." In Companion Proceedings of the Web Conference 2022, pages 843-848. https://dl.acm.org/doi/abs/10.1145/3487553.3524703

Ruler Wrapping

Published in International Journal of Computational Geometry & Applications, 2022

This paper is about the ruler wrapping problem.

Recommended citation: Gagie, Travis and Saeidi, Mozhgan and Sapucaia, Allan. (2022). "Ruler Wrapping." International Journal of Computational Geometry & Applications, pages 1-10. World Scientific. https://arxiv.org/pdf/2109.14497.pdf

Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset

Published in Advances in Neural Information Processing Systems (NeurIPS), 2023

This work explores how well large language models can solve creative problems by analyzing their performance on a dataset inspired by a quiz show segment that involves making connections between seemingly unrelated words.

Recommended citation: Naeini, Saeid and Saqur, Raeid and Saeidi, Mozhgan and Giorgi, John and Taati, Babak. (2023). "Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset." Advances in Neural Information Processing Systems, pages 100-130. NeurIPS, 2023. https://proceedings.neurips.cc/paper_files/paper/2023/file/11e3e0f1b29dcd31bd0952bfc1357f68-Paper-Datasets_and_Benchmarks.pdf

research

talks

Biomedical Word Sense Disambiguation with Contextualized Representation Learning

Published:

Representation learning is an important component in solving most Natural Language Processing (NLP) problems, including Word Sense Disambiguation (WSD). The WSD task tries to find the best meaning in a knowledge base for a word with multiple meanings (an ambiguous word). WSD methods choose this best meaning based on the context, i.e., the words around the ambiguous word in the input text document. Thus, word representations may improve the effectiveness of disambiguation models if they carry useful information from the context and the knowledge base. A limitation of most current representation learning approaches is that they are trained mostly on general English text and are not domain-specific. In this paper, we present a novel contextual and knowledge-base-aware sense representation method for the biomedical domain. The novelty of our representation is the integration of the knowledge base and the context. This representation lies in a space comparable to that of contextualized word vectors, thus allowing a word occurrence to be easily linked to its meaning by applying a simple nearest neighbor approach. Comparing our approach with state-of-the-art methods shows the effectiveness of our method in terms of text coherence.
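The nearest neighbor linking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that sense vectors and the contextualized vector of a word occurrence already exist in the same space, and it uses cosine similarity as the distance measure, which the abstract does not specify. The function names, the toy three-dimensional vectors, and the sense labels are hypothetical and serve only as an illustration, not as the method from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_sense(context_vector, sense_vectors):
    """Return the sense label whose vector is most similar to the occurrence vector."""
    return max(sense_vectors, key=lambda s: cosine(context_vector, sense_vectors[s]))


# Hypothetical toy sense inventory for the ambiguous word "cold".
SENSES = {
    "cold (common cold)": [0.9, 0.1, 0.0],
    "cold (low temperature)": [0.1, 0.9, 0.2],
}

# Hypothetical contextualized vector for one occurrence of "cold" in a clinical note.
occurrence = [0.8, 0.2, 0.1]

print(nearest_sense(occurrence, SENSES))
```

In practice the vectors would come from a contextual encoder and a knowledge base rather than being written by hand; the lookup itself stays a single argmax over the candidate senses.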

Ruler Wrapping

Published:

In 1985 Hopcroft, Joseph and Whitesides showed it is NP-complete to decide whether a carpenter’s ruler with segments of given positive lengths can be folded into an interval of at most a given length, such that the folded hinges alternate between 180 degrees clockwise and 180 degrees counter-clockwise. At the open-problem session of the 33rd Canadian Conference on Computational Geometry (CCCG ’21), O’Rourke proposed a natural variation of this problem called ruler wrapping, in which all folded hinges must be folded the same way. In this paper we show O’Rourke’s variation has a linear-time solution.

CONTEXT-AWARE SEMANTIC TEXT MINING AND REPRESENTATION LEARNING FOR TEXT DISAMBIGUATION AND ONLINE HARASSMENT CLASSIFICATION

Published:

Our contribution can be divided into two main parts: the first part focuses on the text ambiguity problem, and the second part focuses on the text classification problem, which are two related tasks in NLP. While analyzing and designing algorithms for text understanding and representation learning, we introduce algorithms to better understand the text and its exact meaning when there are different possible meanings for words present in the text. This problem is known in NLP as the Word Sense Disambiguation (WSD) problem. In the first part of this thesis, we analyze the effect of currently available text embedding methods on the WSD task, and based on the observations and experiments, we introduce a new method for text representation learning. In addition to general English text, we evaluate our method by analyzing the effect of our representation on biomedical text as an application. This analysis shows how effective these embeddings are in capturing the context when we are looking for the correct meaning behind the words in biomedical texts. We also investigate the problem of text classification in this study. Text classification is another NLP problem related to WSD. We consider a collection of tweets and try to classify them into two classes: tweets that include harassment and tweets that do not. We apply classical machine learning approaches and show the effects and differences between them. In addition to this binary classification investigation, we focus on the first class, the tweets including harassment. In chapter five, we analyze these tweets and classify which type of harassment each tweet contains using classical machine learning approaches, including logistic regression, Gaussian naive Bayes, decision trees, support vector machines, random forest, multi-layer perceptron, and AdaBoost.
The last chapter uses a deep learning approach, the graph convolution approach, to solve this problem. Our experiments show how effective this deep learning method is compared to the previous classical machine learning approaches.

Pensive Project

Published:

Our project developed and enhanced a recommendation system built on the Reddit dataset. Later, we also used a GitHub dataset to train our models. In our model, we used neural language models to enhance the performance of the vector representations.

LLMs Clinical Trial Matching

Published:

Matching patients to clinical trials is a major hurdle in medical research. Current methods, relying on manual screening of medical records, are slow and expensive. This talk explores the potential of Large Language Models (LLMs) to automate this process.

teaching

Teaching Assistant

Instructor and Teaching Assistant, Departments of Computer Science and Mathematics, Dalhousie University, 2022

Courses: Statistical Data Mining, Machine Learning, Deep Learning, Natural Language Processing, Linear Algebra, Serverless Data Processing, Principles of Computer Programming, Cloud Computing, Design and Analysis of Algorithms, Data Structures and Algorithms, Graph Theory, Discrete Mathematics, Graph Machine Learning Algorithms.

Teaching Assistant

Workshop, Vector Institute, UHN, and the University of Toronto, 2022

I was a teaching assistant for the course “Clinicians Champions in AI” from May to July, and for the course “NRCan AI Fundamentals” from July to October.