Most exploratory data analysis tools generate static reports.
You upload a dataset, get dozens of charts, scroll for a few minutes, and leave with information overload instead of actual insight.
After running into this problem repeatedly, I decided to build something different.
So I open sourced XAdaptiveEDA, a Python + Streamlit tool that adapts its recommendations based on how you interact with your data.
GitHub: https://github.com/AshayK003/XadaptiveEDA
What Makes It Different?
Traditional EDA tools treat every dataset and every user the same way.
XAdaptiveEDA tries to behave more like an adaptive system than a one-time report generator.
You upload a CSV, Excel, or JSON file, and the app:
ranks analyses by relevance
tracks your feedback with 👍 and 👎 interactions
adapts future recommendations in real time
avoids repetitive analyses
prioritizes columns and patterns you explore frequently
lets you chat with your dataset using natural language
The goal was to make exploratory data analysis feel more interactive and personalized.
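To make the feedback loop concrete, here is a minimal sketch of how 👍 and 👎 votes could be turned into a per-analysis affinity that fades over time. It is an illustration only, not code from the repo: the `FeedbackTracker` class, the exponential half-life, and the 10-minute default are all assumptions.

```python
import math
from collections import defaultdict

HALF_LIFE_SECONDS = 600.0  # assumed: a vote loses half its weight after 10 minutes


class FeedbackTracker:
    """Keeps a time-decayed affinity score per analysis (hypothetical design)."""

    def __init__(self, half_life=HALF_LIFE_SECONDS):
        self.half_life = half_life
        self.events = defaultdict(list)  # analysis name -> [(timestamp, vote)]

    def record(self, analysis, liked, timestamp):
        # Store +1 for a thumbs-up and -1 for a thumbs-down.
        self.events[analysis].append((timestamp, 1.0 if liked else -1.0))

    def affinity(self, analysis, now):
        # Sum the votes, weighting each one by exponential decay on its age.
        total = 0.0
        for timestamp, vote in self.events.get(analysis, []):
            age = max(0.0, now - timestamp)
            total += vote * math.pow(0.5, age / self.half_life)
        return total


tracker = FeedbackTracker()
tracker.record("correlation", liked=True, timestamp=0.0)
tracker.record("correlation", liked=True, timestamp=600.0)
tracker.record("outliers", liked=False, timestamp=600.0)

print(tracker.affinity("correlation", now=600.0))  # older vote counts half
print(tracker.affinity("outliers", now=600.0))
```

Decaying old votes means a burst of recent interest outweighs something you liked an hour ago, which is one simple way to get the "adapts in real time" behavior.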
Features
Current capabilities include:
Core Analysis
Distribution analysis
Correlation analysis
Missing value detection
Outlier analysis
Categorical analysis
Time series analysis
Clustering
Feature importance
Adaptive Recommendation Engine
The recommendation engine combines:
data relevance
user preferences
novelty scoring
diversity penalties
temporal decay
affinity tracking
ε-greedy exploration
Instead of dumping every possible chart, the tool tries to surface the analyses most likely to matter.
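As a rough illustration of how these signals can be combined, the sketch below scores each candidate analysis as a weighted sum of relevance, preference, and novelty, subtracts a diversity penalty for recently shown analysis families, and then picks with ε-greedy. The `Candidate` fields, the weights, and the penalty value are hypothetical and not taken from the project; the actual engine may weigh things differently.

```python
import random
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str          # e.g. "correlation"
    kind: str          # analysis family, used for the diversity penalty
    relevance: float   # how well the analysis fits the data, 0..1
    preference: float  # learned user preference, roughly -1..1
    times_shown: int   # how often this analysis was already shown


# Hypothetical weights, chosen only for this example.
WEIGHTS = {"relevance": 0.5, "preference": 0.3, "novelty": 0.2}
DIVERSITY_PENALTY = 0.25


def score(c, recent_kinds):
    """Return (total, breakdown) so the ranking stays explainable."""
    novelty = 1.0 / (1.0 + c.times_shown)
    penalty = DIVERSITY_PENALTY * recent_kinds.count(c.kind)
    breakdown = {
        "relevance": WEIGHTS["relevance"] * c.relevance,
        "preference": WEIGHTS["preference"] * c.preference,
        "novelty": WEIGHTS["novelty"] * novelty,
        "diversity_penalty": -penalty,
    }
    return sum(breakdown.values()), breakdown


def pick(candidates, recent_kinds, epsilon=0.1, rng=random):
    """Epsilon-greedy: usually take the top score, sometimes explore at random."""
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda c: score(c, recent_kinds)[0])


candidates = [
    Candidate("correlation", "numeric", relevance=0.9, preference=0.2, times_shown=3),
    Candidate("missing values", "quality", relevance=0.6, preference=0.0, times_shown=0),
    Candidate("distribution", "numeric", relevance=0.7, preference=0.5, times_shown=1),
]

# epsilon=0.0 disables exploration, so the result is deterministic here.
best = pick(candidates, recent_kinds=["numeric"], epsilon=0.0)
print(best.name, score(best, ["numeric"]))
```

Returning the per-term breakdown alongside the total is what lets a UI explain why an analysis was recommended. In this example the never-shown "missing values" analysis wins because the two numeric analyses are penalized for repeating a recently shown family.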
Built-in AI Features
I also added optional LLM integration for:
chatting with datasets
AI-generated analysis insights
smart column naming
natural language query classification
Supported providers:
Ollama (local-first)
OpenRouter
Groq
Custom APIs
One thing I cared about a lot was privacy.
If you use Ollama locally, your data never leaves your machine.
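For a sense of what a local-only call looks like, here is a small sketch that talks to Ollama's REST endpoint on localhost using only the standard library. The model name `llama3` and the helper names are assumptions, and this is not necessarily how XAdaptiveEDA itself calls Ollama. `ask` needs a running Ollama server, so the example only builds the request.

```python
import json
import urllib.request

# Default address of a local Ollama server; the model name below is an assumption.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3", timeout=60):
    """Send the prompt and return the generated text (requires Ollama running)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def summarize_columns(columns):
    # Only column names and dtypes go into the prompt, not the rows themselves.
    described = ", ".join(f"{name} ({dtype})" for name, dtype in columns)
    return f"Suggest three exploratory analyses for a table with columns: {described}"


prompt = summarize_columns([("age", "int64"), ("city", "object")])
request = build_request(prompt)
print(request.full_url)
```

Because the request targets `localhost`, nothing in this path touches a remote service. Swapping in a hosted provider would mean changing the URL and adding an API key, which is where the privacy trade-off comes in.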
Tech Stack
The project is intentionally lightweight.
Built with:
Streamlit
Plotly
pandas
NumPy
SQLite
Ollama
No massive infrastructure setup required.
The entire system currently runs with just 6 dependencies.
Engineering Details
Some things I focused on while building this:
explainable recommendation scoring
session persistence with SQLite
progressive sampling for large datasets
GPU acceleration support through Ollama
rate limiting for remote APIs
modular architecture
fully local workflows
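Session persistence with SQLite can be sketched in a few lines with Python's built-in `sqlite3` module. The table layout and function names below are hypothetical, meant only to show the idea of storing votes per session and reading them back; the project's real schema may look different.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the feedback store; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS feedback (
               session_id TEXT NOT NULL,
               analysis   TEXT NOT NULL,
               vote       INTEGER NOT NULL,   -- +1 or -1
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def save_vote(conn, session_id, analysis, liked):
    # Parameterized query, so analysis names are never interpolated into SQL.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO feedback (session_id, analysis, vote) VALUES (?, ?, ?)",
            (session_id, analysis, 1 if liked else -1),
        )


def load_totals(conn, session_id):
    """Return {analysis: net votes} for one session."""
    rows = conn.execute(
        "SELECT analysis, SUM(vote) FROM feedback WHERE session_id = ? GROUP BY analysis",
        (session_id,),
    )
    return dict(rows.fetchall())


conn = open_store()  # pass a file path instead to keep data across restarts
save_vote(conn, "demo", "correlation", liked=True)
save_vote(conn, "demo", "correlation", liked=True)
save_vote(conn, "demo", "outliers", liked=False)
totals = load_totals(conn, "demo")
print(totals)
```

Using a single local file keeps the "no infrastructure" promise: there is no database server to run, and deleting the file resets everything the tool has learned.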
The project currently has:
68 passing tests
MIT license
modular analysis pipeline
explainable scoring system
Why I Open Sourced It
I strongly believe useful developer tools should be accessible and hackable.
A lot of data tooling today feels:
too enterprise-focused
too rigid
too expensive
or too opaque
I wanted to build something developers could actually inspect, extend, and experiment with.
What’s Next
Planned improvements include:
plugin system for custom analyses
exportable reports
dashboard mode
multi-dataset comparison
collaborative sessions
I also want to improve the recommendation quality and overall UX significantly.
Looking for Feedback
I’d genuinely love feedback from:
data scientists
Python developers
Streamlit builders
open source contributors
anyone working with exploratory analysis workflows
Especially around:
recommendation quality
UI/UX
adaptive scoring logic
real-world usability
GitHub:
https://github.com/AshayK003/XadaptiveEDA
If you find the project interesting, feel free to star the repo or contribute.