FlipVQA-Miner: Multimodal Knowledge Extraction

Multimodal Knowledge Extraction Pipeline Demo

Upload textbook or exam PDFs. MinerU parses the page layout, and an LLM extracts structured question-answer (QA) pairs, which are written to raw_vqa.jsonl.

Pipeline: PDF Upload → MinerU Parsing → LLM QA Extraction → Download Results

All API calls use your own keys. This Space does not store any data or keys.

PDF File(s): upload one file if questions and answers are interleaved, or two files with questions in the first and answers in the second
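
As an illustration of the one-file versus two-file rule above, the sketch below maps an upload list to roles. The function name `assign_pdf_roles` and the returned keys are hypothetical and are not taken from the Space's code.

```python
from typing import Dict, List


def assign_pdf_roles(pdf_paths: List[str]) -> Dict[str, str]:
    """Map uploaded PDFs to roles based on how many files were given.

    Hypothetical helper: the names and return shape are assumptions.
    """
    if len(pdf_paths) == 1:
        # One file: questions and answers are interleaved in the same PDF.
        only = pdf_paths[0]
        return {"mode": "interleaved", "questions": only, "answers": only}
    if len(pdf_paths) == 2:
        # Two files: the first holds questions, the second holds answers.
        return {"mode": "separate", "questions": pdf_paths[0], "answers": pdf_paths[1]}
    raise ValueError("Upload either one PDF, or two PDFs (questions, then answers).")
```

Upload order matters in the two-file case, so a caller would want to preserve the order in which the user selected the files.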

Task Name

API Base URL

LLM API Key (DF_API_KEY)

Model Name

MinerU API Key (MINERU_API_KEY)

⚠️ This key is separate from the LLM API key. Get yours at https://mineru.net/apiManage/token
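
For running a similar pipeline outside this form, the two keys could be supplied as environment variables. The sketch below assumes the labels `DF_API_KEY` and `MINERU_API_KEY` double as variable names; the helper `load_api_keys` is hypothetical.

```python
import os
from typing import Dict, Mapping


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the two independent keys and fail early if either is missing.

    Assumption: the keys are exposed under the names shown in the form labels.
    """
    keys = {}
    for name in ("DF_API_KEY", "MINERU_API_KEY"):
        value = env.get(name, "").strip()
        if not value:
            raise KeyError(f"{name} is not set")
        keys[name] = value
    return keys
```

Checking both keys before any upload avoids discovering a missing MinerU key only after the LLM settings have already been validated.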

Max Workers (1 to 30)
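
The Max Workers setting accepts values from 1 to 30. A minimal sketch of how such a setting might bound concurrent requests is shown below, using a standard-library thread pool. The names `clamp_workers` and `run_parallel` are hypothetical, and the Space may handle concurrency differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MIN_WORKERS, MAX_WORKERS = 1, 30  # bounds shown on the form


def clamp_workers(requested: int) -> int:
    """Force the requested worker count into the 1 to 30 range."""
    return max(MIN_WORKERS, min(MAX_WORKERS, int(requested)))


def run_parallel(func: Callable[[T], R], items: Iterable[T], requested_workers: int) -> List[R]:
    """Apply func to each item with a bounded thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=clamp_workers(requested_workers)) as pool:
        return list(pool.map(func, items))
```

Threads suit this kind of workload because the work is dominated by waiting on remote API responses rather than local computation.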

Status

Download Result (vqa_output.zip: JSONL + images)
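
Once downloaded, the archive can be inspected with the standard library. The sketch below assumes only that the zip contains one or more `.jsonl` files with one JSON object per line. It does not assume any particular field names, since the record schema is not described here. The helper `read_jsonl_from_zip` is hypothetical.

```python
import io
import json
import zipfile
from typing import Any, List


def read_jsonl_from_zip(zip_source: Any) -> List[Any]:
    """Parse every .jsonl file in the archive into a list of records.

    zip_source may be a file path or a binary file-like object.
    """
    records = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".jsonl"):
                continue  # skip images and any other files
            with archive.open(name) as handle:
                for raw_line in io.TextIOWrapper(handle, encoding="utf-8"):
                    line = raw_line.strip()
                    if line:  # ignore blank lines
                        records.append(json.loads(line))
    return records
```

For example, `read_jsonl_from_zip("vqa_output.zip")` would return the parsed records, after which the keys of the first record can be printed to see what fields the pipeline emitted.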