SmartNotes is a production-ready, full-stack web application designed to help students and professionals quickly digest lectures, meetings, and notes. It processes text (pasted or uploaded as TXT, PDF, or DOCX) entirely in memory: no files are ever stored permanently, and all data is discarded when the session ends.
The application leverages Natural Language Processing (NLP) and Machine Learning (ML) models via Python, spaCy, NLTK, and HuggingFace Transformers to extract the most pertinent information.
- In-Memory Processing only: Extremely secure; files and data exist only during the active session.
- Multiple Input Formats: Paste raw text or upload .txt, .pdf, or .docx files.
- Extractive Summarization: Preserves original syntax using a custom TextRank graph algorithm.
- Abstractive Summarization: Uses HuggingFace Transformers (BART) to generate human-like concise summaries.
- Length Control: Choose between Short (20%), Medium (40%), and Detailed (60%) summaries.
- Sentence Classification: Classifies sentences into ⭐ Very Important, ✅ Key Concept, and ℹ️ Supporting Information.
- Keyword & NER Extraction: Identifies the most relevant terms via YAKE and Named Entity Recognition (spaCy).
- Automated Flashcards: Auto-generates Q&A flip-cards based on masked entities from the text.
- Word Cloud: Dynamic base64 rendered word cloud identifying major topics.
- Exporters: Download artifacts as PDF, CSV, DOCX, TXT, or Markdown, all streamed via io.BytesIO.
- Premium UI: TailwindCSS, glassmorphism headers, dark mode toggling, and fully responsive layout.
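The streamed-export pattern above never touches the filesystem: each exporter writes into an in-memory buffer and hands raw bytes to the browser. A minimal sketch for the CSV case, using only the standard library (the function name and payload shape are illustrative assumptions, not the actual exporters.py API):

```python
import csv
import io

def export_keywords_csv(keywords):
    """Serialize (keyword, score) pairs to CSV bytes without touching disk.

    Hypothetical helper illustrating the in-memory export pattern:
    the result can be wrapped in io.BytesIO and sent to the client.
    """
    text_buf = io.StringIO()
    writer = csv.writer(text_buf)
    writer.writerow(["keyword", "score"])
    for kw, score in keywords:
        writer.writerow([kw, f"{score:.4f}"])
    # Encode to bytes so Flask can stream it, e.g. send_file(io.BytesIO(payload))
    return text_buf.getvalue().encode("utf-8")

payload = export_keywords_csv([("graph", 0.91), ("summarization", 0.87)])
```

The same buffer-then-bytes approach generalizes to the PDF (reportlab) and DOCX (python-docx) exporters, since both can write into a file-like object.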
+-----------------------------------------------------------------------------------------+
| USER INTERFACE |
| [ index.html ] --> (Upload/Paste) --> [ results.html ] <-- (Flip-cards, WordCloud) |
| | | |
| (Tailwind CSS, style.css) (script.js logic) |
+---------|-------------------------------------|-----------------------------------------+
| POST /process | GET /download/<type>/<format>
v v
+-----------------------------------------------------------------------------------------+
| FLASK BACKEND |
| (app.py) Routes & Orchestration [ MEMORY_STORE {session_id: results_dict} ] |
+-------------------------------------------|---------------------------------------------+
|
+-------------------------------------------|---------------------------------------------+
| NLP & ML PIPELINE |
| |
| 1. text_cleaner.py: NLTK Tokenization, Stopwords, DocX/PDF parsing stream |
| 2. extractive.py: TextRank via NetworkX and Cosine Similarity |
| 3. abstractive.py: HuggingFace Pipeline (BART) chunking |
| 4. keyword_extractor.py: YAKE algorithm & spaCy NER |
| 5. importance.py: Heuristic TextRank scoring percentiles -> 3 importance tiers |
| 6. flashcards.py: spaCy entity masking generator |
| 7. wordcloud_generator.py: matplotlib -> io.BytesIO -> base64 string |
| 8. exporters.py: reportlab, python-docx, CSV exporters streamed straight to bytes |
+-----------------------------------------------------------------------------------------+
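Step 2 of the pipeline ranks sentences with TextRank over a cosine-similarity graph. The real extractive.py uses NetworkX; the following is a simplified, standard-library-only sketch of the same idea (bag-of-words cosine similarity plus the iterative PageRank update), not the project's actual code:

```python
import math
import re
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def textrank(sentences, d=0.85, iters=30):
    """Score sentences by iterating the PageRank update over a
    sentence-similarity graph (illustration only; extractive.py
    delegates this to NetworkX)."""
    bags = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    scores = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                out = sum(sim[j])  # total outgoing edge weight of node j
                if sim[j][i] and out:
                    rank += sim[j][i] / out * scores[j]
            new.append((1 - d) + d * rank)
        scores = new
    return scores

sents = ["Graphs rank sentences.", "Sentences form a graph.", "Bananas are yellow."]
scores = textrank(sents)
```

An extractive summary then keeps the top-k sentences by score, in their original order, which is what preserves the source syntax.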
Requirements: Python 3.11 (recommended). No database. No login. In-memory processing only.
Windows (PowerShell or CMD):
setup.bat
Mac / Linux:
chmod +x setup.sh
./setup.sh
This creates a virtual environment (venv), installs all dependencies from requirements.txt (stable versions, Transformers 4.x), and downloads the spaCy model en_core_web_sm.
- Clone or download the repository, then navigate to the project directory:
cd smart_notes
- Create and activate a virtual environment:
python -m venv venv # Windows: venv\Scripts\activate # macOS/Linux: source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Download spaCy model (if not done by the app on first run):
python -m spacy download en_core_web_sm
Note: First run may download HuggingFace model distilbart-cnn-12-6 and NLTK data; this can take a few minutes.
- Activate the virtual environment (if not already):
# Windows: venv\Scripts\activate # macOS/Linux: source venv/bin/activate
- Start the application:
python app.py
- Open your browser at: http://127.0.0.1:5000
To confirm that summarization and NLP modules work without starting the web app:
python test_model.py
All six pipeline steps (text cleaning, extractive & abstractive summarization, keywords, importance, word cloud) are tested.
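Individual pipeline steps can also be exercised in isolation. For example, the flashcard generator builds Q&A pairs by masking a named entity in a sentence; a minimal sketch of that idea (the function name and dict shape are assumptions — the real flashcards.py uses spaCy NER to find the entity):

```python
def make_flashcard(sentence, entity):
    """Mask one named entity to form a question/answer pair.

    Hypothetical helper mirroring the entity-masking approach of
    flashcards.py; in the app, `entity` comes from spaCy NER.
    """
    question = sentence.replace(entity, "_____")
    return {"question": question, "answer": entity}

card = make_flashcard("Alan Turing proposed the imitation game in 1950.", "Alan Turing")
```

The front of the flip-card shows the masked question; the back reveals the entity.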
- Models and libraries — List of every model and library (Flask, spaCy, NLTK, HuggingFace, YAKE, NetworkX, WordCloud, ReportLab, python-docx, etc.) and why each is used.
- Landing Page: A beautiful hero section featuring a pastel-indigo gradient header. Options to toggle between "Paste Text" and "Upload File" in a clean glass-effect card. Length control slider positioned clearly. Top right features a Dark Mode toggle (Moon/Sun icon).
- Analysis View (Results):
- Top banner metrics show Original Format (words), Summarized (words), and Reduction (%).
- Left side: Scrollable cards displaying the original source text, and below it, an Analysis Section categorizing extracted sentences by "⭐ Very Important", "✅ Key Concept", etc., with beautifully colored badges.
- Right side: The Summaries panel combining Extractive and Abstractive approaches (controlled by aesthetic toggle buttons). Below it, Flashcards rendered as interactive 3D CSS flip-cards, alongside the generated Word Cloud image and pill-styled Keywords.
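The three importance badges come from bucketing sentence scores into percentile tiers (pipeline step 5). A standard-library sketch of how such tiering could work — the quartile thresholds here are an assumption, not the exact cutoffs in importance.py:

```python
import statistics

def assign_tiers(scores):
    """Bucket sentence scores into three importance tiers by percentile.

    Illustrative only: importance.py uses its own heuristic percentile
    thresholds; here we assume the median and upper quartile as cutoffs.
    """
    if len(scores) < 2:
        return ["⭐ Very Important"] * len(scores)
    _, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    tiers = []
    for s in scores:
        if s >= q3:
            tiers.append("⭐ Very Important")
        elif s >= q2:
            tiers.append("✅ Key Concept")
        else:
            tiers.append("ℹ️ Supporting Information")
    return tiers

scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.9, 0.95]
tiers = assign_tiers(scores)
```

Each tier then maps to one of the colored badges in the Analysis Section.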
- Implement WebSocket streaming for the abstractive summarizer so the user sees text generating in real-time. ✅ Done: On the results page, switch to the "Abstractive" tab and click "Watch live generation" to stream the summary chunk-by-chunk over WebSocket.
- Introduce advanced LLM QA generation for more robust questions on the flashcards rather than simply masking Named Entities.
- Add multi-language summarization support utilizing XLM-RoBERTa.
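The chunk-by-chunk streaming in the first roadmap item can be reduced to a transport-agnostic generator: the server produces fixed-size pieces of the summary and the WebSocket layer forwards each one as it arrives. A minimal sketch (chunk size and function name are illustrative assumptions):

```python
def stream_chunks(text, size=40):
    """Yield a finished summary in fixed-size chunks, the way a
    WebSocket handler might emit them to the client one by one.

    Transport-agnostic sketch: the real app pushes each yielded
    chunk over the socket as it becomes available.
    """
    for i in range(0, len(text), size):
        yield text[i:i + size]

chunks = list(stream_chunks("a" * 100, size=40))
```

On the client, script.js would append each received chunk to the Abstractive panel, giving the appearance of live generation.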