Extract API

The Extract API extracts structured data from files or URLs using a defined schema.

Schema

Define the structure of the data to extract by mapping each field name to a type name:

const schema = {
  title: "string",
  author: "string",
  publication_year: "int",
  abstract: "string",
};
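Because the schema is a plain object, it can be typed and reused to sanity-check what comes back. The following is a minimal sketch, not part of the API: `FieldType`, `Schema`, and `findSchemaMismatches` are hypothetical names, and it assumes `"string"` and `"int"` are the only type names in play, since those are the only ones shown here.

```typescript
// Type names used in the schema; only "string" and "int" appear in this document.
type FieldType = "string" | "int";
type Schema = Record<string, FieldType>;

const schema: Schema = {
  title: "string",
  author: "string",
  publication_year: "int",
  abstract: "string",
};

// Hypothetical helper: returns the names of fields that are missing
// from `result` or whose value does not match the declared type.
function findSchemaMismatches(
  result: Record<string, unknown>,
  schema: Schema,
): string[] {
  const problems: string[] = [];
  for (const [field, type] of Object.entries(schema)) {
    const value = result[field];
    const ok =
      type === "string"
        ? typeof value === "string"
        : typeof value === "number" && Number.isInteger(value);
    if (!ok) problems.push(field);
  }
  return problems;
}
```

An empty array from `findSchemaMismatches` means every declared field is present with the expected type.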

Basic Usage

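// API_URL and API_KEY are assumed to be defined elsewhere in your code.
async function extractData(file: File, schema: object): Promise<void> {
  const formData = new FormData();
  formData.append("files", file);
  formData.append("schema", JSON.stringify(schema));

  const response = await fetch(`${API_URL}/extract`, {
    method: "POST",
    headers: { Authorization: `Bearer ${API_KEY}` },
    body: formData,
  });
  if (!response.ok || !response.body) {
    throw new Error(`Extract request failed with status ${response.status}`);
  }

  // Log one complete line of the newline-delimited JSON response.
  const logLine = (line: string): void => {
    if (!line.trim()) return;
    const { result, tokens_used } = JSON.parse(line);
    console.log("Extracted data:", result);
    console.log("Tokens used:", tokens_used);
  };

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // A chunk may end mid-line, so keep the unfinished tail for the next read.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    lines.forEach(logLine);
  }
  // Flush a final line that arrived without a trailing newline.
  logLine(buffer + decoder.decode());
}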
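The usage code logs each streamed line as it arrives. A caller that wants the records themselves, plus a running token total, could collect them instead. This is a sketch under assumptions: `ExtractRecord` and `NdjsonCollector` are hypothetical names, the record shape is inferred from the two fields read above, and `tokens_used` is assumed to be a number.

```typescript
// Shape of one streamed line, inferred from the fields the usage code reads.
interface ExtractRecord {
  result: unknown;
  tokens_used: number; // assumption: a numeric token count
}

// Hypothetical helper that accumulates decoded text and parses complete lines only.
class NdjsonCollector {
  private buffer = "";
  readonly records: ExtractRecord[] = [];

  // Feed the decoded text of one chunk; a trailing partial line stays buffered.
  push(text: string): void {
    this.buffer += text;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (line.trim()) this.records.push(JSON.parse(line) as ExtractRecord);
    }
  }

  // Call once the stream ends to parse a final line with no trailing newline.
  finish(): void {
    if (this.buffer.trim()) {
      this.records.push(JSON.parse(this.buffer) as ExtractRecord);
    }
    this.buffer = "";
  }

  // Sum of tokens_used across all records parsed so far.
  get totalTokens(): number {
    return this.records.reduce((sum, r) => sum + r.tokens_used, 0);
  }
}
```

Inside the read loop, each decoded chunk would go to `push`, with `finish` called after the loop exits. Decoding with a single `TextDecoder` and `{ stream: true }` keeps multi-byte characters intact across chunk boundaries.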

Options

  • text_only (boolean): Extract only text content
  • ai_extraction (boolean): Use AI for layout analysis
  • chunking_method (string): Content chunking method
  • ai_model (string): AI model for extraction
  • multiple_extractions (boolean): Allow multiple extractions per chunk
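The list above gives option names and types but not how they are sent. One plausible approach, and it is only an assumption here, is to pass them as extra form fields alongside `files` and `schema`. The sketch below types the options and converts the ones that are set into string pairs; `ExtractOptions` and `optionsToFields` are hypothetical names, not part of the API.

```typescript
// Option names and types as listed above; all treated as optional here.
interface ExtractOptions {
  text_only?: boolean;
  ai_extraction?: boolean;
  chunking_method?: string;
  ai_model?: string;
  multiple_extractions?: boolean;
}

// Hypothetical helper: turns the options that are set into [name, value] string pairs.
function optionsToFields(options: ExtractOptions): [string, string][] {
  return Object.entries(options)
    .filter(([, value]) => value !== undefined)
    .map(([name, value]): [string, string] => [name, String(value)]);
}

// Assumed usage (form-field transport is a guess, check the API reference):
// for (const [name, value] of optionsToFields({ text_only: true })) {
//   formData.append(name, value);
// }
```

Keeping the conversion separate from the request code makes it easy to swap the transport, for example to a JSON body or query string, if the API expects options elsewhere.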