Plagiarism Checker App

Plagiarism Detection System: Ensuring Academic Integrity

This project developed a mobile app for plagiarism detection that helps educators identify and flag plagiarized content in student submissions. Using Natural Language Processing (NLP) techniques, the system compares each assignment against existing data and produces a detailed report on potential plagiarism.

Key Features:

  • Automated Plagiarism Detection: Employs sentence embeddings and cosine similarity to compare submissions, flagging sentences whose similarity score exceeds a defined threshold
  • Detailed Reporting: Provides educators with flagged sentences, similarity percentages, and source information (student, course, assignment...)
  • Flexible Thresholds: Allows educators to adjust sensitivity for tailored plagiarism checks
  • Assignment Management: Enables teachers to create assignments and students to submit solutions seamlessly
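
For reference, the similarity test behind the detection and threshold features can be written out as follows. The symbols u and v (the embeddings of two sentences) and τ (the educator-chosen threshold) are notation introduced here, not taken from the project itself.

```latex
\[
\operatorname{sim}(\mathbf{u}, \mathbf{v})
  = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert},
\qquad
\text{flag the sentence if } \operatorname{sim}(\mathbf{u}, \mathbf{v}) > \tau
\]
```

A reported similarity percentage would presumably be 100 times this value, though that mapping is an assumption. Raising τ makes the check stricter, so fewer sentences are flagged.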

Technical Implementation:

  • Core Technologies: Flask (Python backend), SQLAlchemy (database management), SentenceTransformer (all-MiniLM-L6-v2, sentence embeddings), Cosine Similarity, Clustering, NLTK (text preprocessing).
  • Process Flow:
    1. Preprocessing: Text is converted to lowercase, stop words are removed, and sentences are tokenized.
    2. Embedding Generation: SentenceTransformer generates numerical embeddings representing semantic meaning.
    3. Similarity Analysis: Cosine similarity compares embeddings, and clustering groups similar sentences.
    4. Flagging: Sentences exceeding the threshold are flagged, with source information recorded.
    5. Reporting: Educators access detailed reports with flagged content and similarity scores.
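
The process flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: to stay dependency-free it swaps in a bag-of-words count vector for the all-MiniLM-L6-v2 embeddings, a regular expression for NLTK sentence tokenization, and a tiny hand-written stop-word list for NLTK's, and it leaves out the clustering step. The names `preprocess`, `embed`, `flag_sentences`, and the report fields are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list; the project uses NLTK for preprocessing.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "it"}


def preprocess(text):
    """Split text into sentences, lowercase them, and drop stop words.

    Returns a list of (original_sentence, kept_tokens) pairs.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    cleaned = []
    for sentence in sentences:
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        kept = [w for w in words if w not in STOP_WORDS]
        if kept:
            cleaned.append((sentence, kept))
    return cleaned


def embed(tokens):
    """Stand-in embedding: word counts, NOT a semantic sentence embedding."""
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_sentences(submission, sources, threshold=0.8):
    """Flag submission sentences whose best match exceeds the threshold.

    `sources` maps a source label (for example a student name) to its text.
    """
    source_vectors = [
        (label, original, embed(tokens))
        for label, text in sources.items()
        for original, tokens in preprocess(text)
    ]
    report = []
    for original, tokens in preprocess(submission):
        vector = embed(tokens)
        scored = [
            (cosine_similarity(vector, source_vector), label, matched)
            for label, matched, source_vector in source_vectors
        ]
        if not scored:
            continue
        score, label, matched = max(scored, key=lambda item: item[0])
        if score > threshold:
            report.append({
                "sentence": original,
                "matched_sentence": matched,
                "source": label,
                "similarity": round(score * 100, 1),  # percentage
            })
    return report


if __name__ == "__main__":
    sources = {"student_a": "The mitochondria is the powerhouse of the cell."}
    submission = "Mitochondria is the powerhouse of the cell. I like football."
    for entry in flag_sentences(submission, sources, threshold=0.8):
        print(entry)
```

To get closer to the stack listed above, `embed` could be replaced by `SentenceTransformer("all-MiniLM-L6-v2").encode(...)` from the sentence-transformers package, with `sentence_transformers.util.cos_sim` computing the scores. Unlike word counts, those embeddings can also catch paraphrased sentences.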

Project information