@ucbepic

EPIC Data Lab

Effective Programming Interaction and Computation with Data

Popular repositories

  1. docetl Public

    A system for agentic LLM-powered data processing and ETL

    Python 2k 191

  2. TWIX Public

    TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inferring the shared underlying visual template across documents

    Python 185 8

  3. BARGAIN Public

    Low-Cost LLM-Powered Data Processing with Theoretical Guarantees

    Python 18 2

  4. pdf_parser Public

    Parse PDFs using computer vision, layout analysis, and other state-of-the-art document intelligence techniques. The web app is implemented in Flask/Jinja2, with inference and training pipelines managed by FlorDB

    JavaScript 7

  5. docetl-examples Public

    Examples of docetl pipelines

    Python 2

  6. ml_tutorial Public

    Introduction to FlorDB with PyTorch and TensorFlow

    Jupyter Notebook

Repositories

Showing 6 of 6 repositories
