
# JSON Schema

> Full field reference for the structured output returned by [to_json()](/python/api/to_json).

<div id="apiIndicatorBadge">
  <div class="inner pymupdf" />
</div>

## Overview

`to_json()` returns a single JSON object describing the parsed PDF. Its `pages` array holds one page object per extracted page. Each page contains a list of blocks (`boxes`), and each block carries type-specific fields. This page documents every object and field in the output hierarchy.

<img src="https://mintcdn.com/artifex-e87ae94c/pzs2KzBbIyE6CrRb/images/json-schema.svg?fit=max&auto=format&n=pzs2KzBbIyE6CrRb&q=85&s=62d9bc5f15d6e14c58e5e8e2423ecd68" alt="PyMuPDF4LLM JSON Schema Diagram" className="mx-auto mb-0" width="680" height="1400" data-path="images/json-schema.svg" />

<Accordion title="Show full example">
  ```json theme={null}
  {
    "filename": "hello-world.pdf",
    "page_count": 2,
    "toc": [],
    "pages": [
      {
        "page_number": 1,
        "width": 595.2000122070312,
        "height": 841.9199829101562,
        "boxes": [
          {
            "x0": 72,
            "y0": 71.99996948242188,
            "x1": 334.470947265625,
            "y1": 273.3801574707031,
            "boxclass": "picture",
            "image": "images/hello-world.pdf-0001-00.png",
            "table": null,
            "textlines": []
          },
          {
            "x0": 70.69100189208984,
            "y0": 295.880126953125,
            "x1": 197.27691650390625,
            "y1": 304.62628173828125,
            "boxclass": "text",
            "image": null,
            "table": null,
            "textlines": [
              {
                "bbox": [
                  70.69100189208984,
                  295.880126953125,
                  197.27691650390625,
                  304.62628173828125
                ],
                "spans": [
                  {
                    "size": 12,
                    "flags": 0,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Arial",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "Hello World!",
                    "origin": [
                      70.69100189208984,
                      304.469970703125
                    ],
                    "bbox": [
                      70.69100189208984,
                      295.880126953125,
                      136.09201049804688,
                      304.610595703125
                    ],
                    "line": 0,
                    "block": 0,
                    "dir": [
                      1,
                      0
                    ]
                  },
                  {
                    "size": 12,
                    "flags": 20,
                    "bidi": 0,
                    "char_flags": 24,
                    "font": "MinionPro-Bold",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "This is bold",
                    "origin": [
                      138.8310089111328,
                      304.469970703125
                    ],
                    "bbox": [
                      138.8310089111328,
                      296.0342712402344,
                      197.27691650390625,
                      304.62628173828125
                    ],
                    "line": 0,
                    "block": 0,
                    "dir": [
                      1,
                      0
                    ]
                  }
                ]
              }
            ]
          }
        ],
        "full_ocred": false,
        "text_ocred": false,
        "fulltext": [
          {
            "type": 0,
            "number": 0,
            "flags": 0,
            "bbox": [
              70.69100189208984,
              295.880126953125,
              197.27691650390625,
              304.62628173828125
            ],
            "lines": [
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 0,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Arial",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "Hello World!",
                    "origin": [
                      70.69100189208984,
                      304.469970703125
                    ],
                    "bbox": [
                      70.69100189208984,
                      295.880126953125,
                      136.09201049804688,
                      304.610595703125
                    ],
                    "line": 0,
                    "block": 0,
                    "dir": [
                      1,
                      0
                    ]
                  },
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "MinionPro-Regular",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": " ",
                    "origin": [
                      136.09201049804688,
                      304.469970703125
                    ],
                    "bbox": [
                      136.09201049804688,
                      304.469970703125,
                      138.81600952148438,
                      304.469970703125
                    ]
                  },
                  {
                    "size": 12,
                    "flags": 20,
                    "bidi": 0,
                    "char_flags": 24,
                    "font": "MinionPro-Bold",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "This is bold",
                    "origin": [
                      138.8310089111328,
                      304.469970703125
                    ],
                    "bbox": [
                      138.8310089111328,
                      296.0342712402344,
                      197.27691650390625,
                      304.62628173828125
                    ],
                    "line": 0,
                    "block": 0,
                    "dir": [
                      1,
                      0
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  70.69100189208984,
                  295.880126953125,
                  197.27691650390625,
                  304.62628173828125
                ]
              }
            ]
          }
        ],
        "words": [],
        "links": []
      },
      {
        "page_number": 2,
        "width": 595.2000122070312,
        "height": 841.9199829101562,
        "boxes": [
          {
            "x0": 72,
            "y0": 72,
            "x1": 524,
            "y1": 118,
            "boxclass": "table",
            "image": null,
            "table": {
              "bbox": [
                71.15000104904175,
                72.19200134277344,
                523.219970703125,
                117.67998962402343
              ],
              "row_count": 3,
              "col_count": 4,
              "cells": [
                [
                  [
                    71.15000104904175,
                    72.19200134277344,
                    184.60000038146973,
                    87.3599967956543
                  ],
                  [
                    184.60000038146973,
                    72.19200134277344,
                    297.1599922180176,
                    87.3599967956543
                  ],
                  [
                    297.1599922180176,
                    72.19200134277344,
                    409.96001052856445,
                    87.3599967956543
                  ],
                  [
                    409.96001052856445,
                    72.19200134277344,
                    523.219970703125,
                    87.3599967956543
                  ]
                ],
                [
                  [
                    71.15000104904175,
                    87.3599967956543,
                    184.60000038146973,
                    102.4799919128418
                  ],
                  [
                    184.60000038146973,
                    87.3599967956543,
                    297.1599922180176,
                    102.4799919128418
                  ],
                  [
                    297.1599922180176,
                    87.3599967956543,
                    409.96001052856445,
                    102.4799919128418
                  ],
                  [
                    409.96001052856445,
                    87.3599967956543,
                    523.219970703125,
                    102.4799919128418
                  ]
                ],
                [
                  [
                    71.15000104904175,
                    102.4799919128418,
                    184.60000038146973,
                    117.67998962402343
                  ],
                  [
                    184.60000038146973,
                    102.4799919128418,
                    297.1599922180176,
                    117.67998962402343
                  ],
                  [
                    297.1599922180176,
                    102.4799919128418,
                    409.96001052856445,
                    117.67998962402343
                  ],
                  [
                    409.96001052856445,
                    102.4799919128418,
                    523.219970703125,
                    117.67998962402343
                  ]
                ]
              ],
              "extract": [
                [
                  "A",
                  "B",
                  "C",
                  "D"
                ],
                [
                  "A1",
                  "B1",
                  "C1",
                  "D1"
                ],
                [
                  "A2",
                  "B2",
                  "C2",
                  "D2"
                ]
              ],
              "markdown": "|A|B|C|D|\n|---|---|---|---|\n|A1|B1|C1|D1|\n|A2|B2|C2|D2|\n\n"
            },
            "textlines": null
          }
        ],
        "full_ocred": false,
        "text_ocred": false,
        "fulltext": [
          {
            "type": 0,
            "number": 0,
            "flags": 0,
            "bbox": [
              77.76000213623047,
              75.767822265625,
              426.34820556640625,
              83.865478515625
            ],
            "lines": [
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "A ",
                    "origin": [
                      77.76000213623047,
                      83.760009765625
                    ],
                    "bbox": [
                      77.76000213623047,
                      75.873291015625,
                      87.26829528808594,
                      83.760009765625
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  77.76000213623047,
                  75.873291015625,
                  87.26829528808594,
                  83.760009765625
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "B ",
                    "origin": [
                      190.32000732421875,
                      83.760009765625
                    ],
                    "bbox": [
                      190.32000732421875,
                      75.873291015625,
                      200.0041046142578,
                      83.760009765625
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  190.32000732421875,
                  75.873291015625,
                  200.0041046142578,
                  83.760009765625
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "C ",
                    "origin": [
                      303.1199951171875,
                      83.760009765625
                    ],
                    "bbox": [
                      303.1199951171875,
                      75.767822265625,
                      313.86480712890625,
                      83.865478515625
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  303.1199951171875,
                  75.767822265625,
                  313.86480712890625,
                  83.865478515625
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "D ",
                    "origin": [
                      415.67999267578125,
                      83.760009765625
                    ],
                    "bbox": [
                      415.67999267578125,
                      75.873291015625,
                      426.34820556640625,
                      83.760009765625
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  415.67999267578125,
                  75.873291015625,
                  426.34820556640625,
                  83.760009765625
                ]
              }
            ]
          },
          {
            "type": 0,
            "number": 11,
            "flags": 0,
            "bbox": [
              77.76000213623047,
              90.8878173828125,
              432.7583923339844,
              98.9854736328125
            ],
            "lines": [
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "A1 ",
                    "origin": [
                      77.76000213623047,
                      98.8800048828125
                    ],
                    "bbox": [
                      77.76000213623047,
                      90.9932861328125,
                      93.67839813232422,
                      98.8800048828125
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  77.76000213623047,
                  90.9932861328125,
                  93.67839813232422,
                  98.8800048828125
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "B1 ",
                    "origin": [
                      190.32000732421875,
                      98.8800048828125
                    ],
                    "bbox": [
                      190.32000732421875,
                      90.9932861328125,
                      206.414306640625,
                      98.8800048828125
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  190.32000732421875,
                  90.9932861328125,
                  206.414306640625,
                  98.8800048828125
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "C1 ",
                    "origin": [
                      303.1199951171875,
                      98.8800048828125
                    ],
                    "bbox": [
                      303.1199951171875,
                      90.8878173828125,
                      320.2749938964844,
                      98.9854736328125
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  303.1199951171875,
                  90.8878173828125,
                  320.2749938964844,
                  98.9854736328125
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "D1 ",
                    "origin": [
                      415.67999267578125,
                      98.8800048828125
                    ],
                    "bbox": [
                      415.67999267578125,
                      90.9932861328125,
                      432.7583923339844,
                      98.8800048828125
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  415.67999267578125,
                  90.9932861328125,
                  432.7583923339844,
                  98.8800048828125
                ]
              }
            ]
          },
          {
            "type": 0,
            "number": 22,
            "flags": 0,
            "bbox": [
              77.76000213623047,
              106.0078125,
              432.7583923339844,
              114.10546875
            ],
            "lines": [
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "A2 ",
                    "origin": [
                      77.76000213623047,
                      114
                    ],
                    "bbox": [
                      77.76000213623047,
                      106.11328125,
                      93.67839813232422,
                      114
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  77.76000213623047,
                  106.11328125,
                  93.67839813232422,
                  114
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "B2 ",
                    "origin": [
                      190.32000732421875,
                      114
                    ],
                    "bbox": [
                      190.32000732421875,
                      106.11328125,
                      206.414306640625,
                      114
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  190.32000732421875,
                  106.11328125,
                  206.414306640625,
                  114
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "C2 ",
                    "origin": [
                      303.1199951171875,
                      114
                    ],
                    "bbox": [
                      303.1199951171875,
                      106.0078125,
                      320.2749938964844,
                      114.10546875
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  303.1199951171875,
                  106.0078125,
                  320.2749938964844,
                  114.10546875
                ]
              },
              {
                "spans": [
                  {
                    "size": 12,
                    "flags": 4,
                    "bidi": 0,
                    "char_flags": 16,
                    "font": "Aptos",
                    "color": 0,
                    "alpha": 255,
                    "ascender": 0.800000011920929,
                    "descender": -0.20000000298023224,
                    "text": "D2 ",
                    "origin": [
                      415.67999267578125,
                      114
                    ],
                    "bbox": [
                      415.67999267578125,
                      106.11328125,
                      432.7583923339844,
                      114
                    ]
                  }
                ],
                "wmode": 0,
                "dir": [
                  1,
                  0
                ],
                "bbox": [
                  415.67999267578125,
                  106.11328125,
                  432.7583923339844,
                  114
                ]
              }
            ]
          }
        ],
        "words": [],
        "links": []
      }
    ],
    "metadata": {
      "format": "PDF 1.6",
      "title": "",
      "author": "",
      "subject": "",
      "keywords": "",
      "creator": "",
      "producer": "",
      "creationDate": "D:20240722172345Z",
      "modDate": "D:20260318153118Z",
      "trapped": "",
      "encryption": null
    }
  }
  ```
</Accordion>

The extraction response is a single JSON object describing a parsed PDF: its pages, text content, tables, images, and metadata. Most objects also carry positional data, given as bounding boxes.

<Note>
  Positional coordinates are in PDF points (1 point = 1/72 inch). The origin `(0, 0)` is the **top-left** corner of the page.
</Note>
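
As a rough illustration of the unit convention, the sketch below scales a box from PDF points to pixel coordinates for a page rendered at a chosen resolution. The helper name `points_to_pixels` and the DPI values are assumptions made for this example; they are not part of the `to_json()` output.

```python
def points_to_pixels(bbox, dpi=150):
    """Scale an [x0, y0, x1, y1] box from PDF points to pixels.

    One point is 1/72 inch, so the scale factor is dpi / 72. The origin
    stays at the top-left corner, so no y-axis flip is applied.
    """
    scale = dpi / 72
    return [round(v * scale) for v in bbox]


# Picture box from the full example (values rounded for readability).
picture_bbox = [72, 72, 334.47, 273.38]

# At 144 DPI every coordinate is simply doubled.
pixel_bbox = points_to_pixels(picture_bbox, dpi=144)
print(pixel_bbox)
```

At 72 DPI the scale factor is 1, so point and pixel coordinates coincide.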

## Root object

The top-level object returned for every extraction.

<Accordion title="Example">
  ```json theme={null}
  {
    "filename": "hello-world.pdf",
    "page_count": 2,
    "toc": [],
    "pages": [...],
    "metadata": {...}
  }
  ```
</Accordion>

<ParamField body="filename" type="string">
  The name of the source PDF file that was parsed.
</ParamField>

<ParamField body="page_count" type="number">
  Total number of pages in the PDF.
</ParamField>

<ParamField body="toc" type="array">
  Table of contents entries extracted from the PDF. Each entry is an array of
  `[level, title, page_number]`, where `level` is the 1-based outline depth.
  Empty when the PDF has no bookmarks or outline.
</ParamField>

<ParamField body="pages" type="array">
  Array of [page objects](#page-object), one per page in the PDF.
</ParamField>

<ParamField body="metadata" type="object">
  PDF document metadata. See [metadata object](#metadata-object).
</ParamField>
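
The root fields can be read with the standard `json` module once the output is available as a JSON string. The sketch below uses an abbreviated, hand-written stand-in for that string (the page objects are reduced to their `page_number`), and the `summary` dictionary is purely illustrative.

```python
import json

# Abbreviated stand-in for the JSON output; a real response carries
# complete page objects rather than these reduced placeholders.
raw = """
{
  "filename": "hello-world.pdf",
  "page_count": 2,
  "toc": [],
  "pages": [{"page_number": 1}, {"page_number": 2}],
  "metadata": {"format": "PDF 1.6", "title": ""}
}
"""

doc = json.loads(raw)

# Collect a few root-level facts about the document.
summary = {
    "filename": doc["filename"],
    "page_count": doc["page_count"],
    "extracted_pages": [page["page_number"] for page in doc["pages"]],
    "has_toc": bool(doc["toc"]),
    "format": doc["metadata"].get("format"),
}
print(summary)
```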

***

## Page object

Represents a single page of the PDF. Found in `pages[]`.

<Accordion title="Example">
  ```json theme={null}
  {
    "page_number": 1,
    "width": 595.2,
    "height": 841.92,
    "boxes": [...],
    "fulltext": [...],
    "full_ocred": false,
    "text_ocred": false,
    "words": [],
    "links": []
  }
  ```
</Accordion>

<ParamField body="page_number" type="number">
  1-based index of this page within the document.
</ParamField>

<ParamField body="width" type="number">
  Page width in PDF user units (points). A standard A4 page is 595.28 pt wide.
</ParamField>

<ParamField body="height" type="number">
  Page height in PDF user units (points). A standard A4 page is 841.89 pt tall.
</ParamField>

<ParamField body="boxes" type="array">
  Detected content regions on the page. Each entry is a [box object](#box-object).
  The `boxclass` field of each box identifies its type, for example `text`,
  `picture`, or `table`.
</ParamField>

<ParamField body="fulltext" type="array">
  Raw text blocks extracted directly from the PDF's content stream, independent
  of the box layout. Each entry is a [fulltext block](#fulltext-block). This
  mirrors the logical reading order as encoded in the PDF.
</ParamField>

<ParamField body="full_ocred" type="boolean">
  `true` if the entire page was processed through OCR because no native text
  layer was found.
</ParamField>

<ParamField body="text_ocred" type="boolean">
  `true` if individual text regions were OCR'd (as opposed to full-page OCR).
</ParamField>

<ParamField body="words" type="array">
  Word-level bounding boxes. Empty in this format variant.
</ParamField>

<ParamField body="links" type="array">
  Hyperlinks found on the page. Empty when no links are present.
</ParamField>
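
A minimal sketch of how the page-level fields might be combined: it counts boxes per `boxclass` and reports whether either OCR flag is set. The `describe_page` helper and the reduced sample page are assumptions for illustration only.

```python
from collections import Counter

# Reduced page object: each box keeps only its boxclass.
page = {
    "page_number": 1,
    "width": 595.2,
    "height": 841.92,
    "boxes": [{"boxclass": "picture"}, {"boxclass": "text"}],
    "fulltext": [],
    "full_ocred": False,
    "text_ocred": False,
    "words": [],
    "links": [],
}


def describe_page(page):
    """Return box counts per class and whether any OCR was involved."""
    counts = Counter(box["boxclass"] for box in page["boxes"])
    used_ocr = page["full_ocred"] or page["text_ocred"]
    return dict(counts), used_ocr


counts, used_ocr = describe_page(page)
print(counts, used_ocr)
```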

***

## Box object

A detected content region on a page. Found in `pages[].boxes[]`.

Boxes are the primary layout unit.

Each box covers a rectangular area and is classified into one of these types:

```text theme={null}
    text
    picture
    table
    caption
    title
    section-header
    page-header
    page-footer
    list-item
    footnote
    formula
```

<AccordionGroup>
  <Accordion title="Text box example">
    ```json theme={null}
    {
      "x0": 70.69,
      "y0": 295.88,
      "x1": 197.28,
      "y1": 304.63,
      "boxclass": "text",
      "image": null,
      "table": null,
      "textlines": [...]
    }
    ```
  </Accordion>

  <Accordion title="Picture box example">
    ```json theme={null}
    {
      "x0": 72,
      "y0": 72,
      "x1": 334.47,
      "y1": 273.38,
      "boxclass": "picture",
      "image": "images/hello-world.pdf-0001-00.png",
      "table": null,
      "textlines": []
    }
    ```
  </Accordion>

  <Accordion title="Table box example">
    ```json theme={null}
    {
      "x0": 72,
      "y0": 72,
      "x1": 524,
      "y1": 118,
      "boxclass": "table",
      "image": null,
      "table": {...},
      "textlines": null
    }
    ```
  </Accordion>
</AccordionGroup>

<ParamField body="x0" type="number">
  Left edge of the box in PDF points, measured from the left of the page.
</ParamField>

<ParamField body="y0" type="number">
  Top edge of the box in PDF points, measured from the top of the page.
</ParamField>

<ParamField body="x1" type="number">
  Right edge of the box in PDF points.
</ParamField>

<ParamField body="y1" type="number">
  Bottom edge of the box in PDF points.
</ParamField>

<ParamField body="boxclass" type="string">
  Classification of the content region. The examples on this page use three
  values:

  * `"text"`: contains text lines and spans
  * `"picture"`: contains an embedded image
  * `"table"`: contains a detected table structure

  See the list at the top of this section for every possible value.
</ParamField>

<ParamField body="image" type="string | null">
  Relative path to the extracted image file when `boxclass` is `"picture"`.
  `null` for all other box types.
</ParamField>

<ParamField body="table" type="object | null">
  A [table object](#table-object) when `boxclass` is `"table"`. `null` for all
  other box types.
</ParamField>

<ParamField body="textlines" type="array | null">
  Array of [textline objects](#textline-object) when `boxclass` is `"text"`.
  Empty array `[]` for picture boxes. `null` for table boxes.
</ParamField>
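
Because `image`, `table`, and `textlines` are populated depending on `boxclass`, consumers usually branch on that field. The sketch below shows one way to do so while tolerating `null` values. The `summarize_box` helper and its output strings are hypothetical, and the sample boxes are trimmed versions of the examples above.

```python
def summarize_box(box):
    """Describe a box in one line, branching on its boxclass."""
    cls = box["boxclass"]
    if cls == "table" and box["table"] is not None:
        table = box["table"]
        return f"table {table['row_count']}x{table['col_count']}"
    if cls == "picture" and box["image"] is not None:
        return f"picture -> {box['image']}"
    # textlines can be null (table boxes) or empty (picture boxes),
    # so fall back to an empty list before counting.
    line_count = len(box["textlines"] or [])
    return f"{cls} with {line_count} line(s)"


boxes = [
    {"boxclass": "text", "image": None, "table": None,
     "textlines": [{"bbox": [70.69, 295.88, 197.28, 304.63], "spans": []}]},
    {"boxclass": "picture", "image": "images/hello-world.pdf-0001-00.png",
     "table": None, "textlines": []},
    {"boxclass": "table", "image": None,
     "table": {"row_count": 3, "col_count": 4}, "textlines": None},
]

summaries = [summarize_box(box) for box in boxes]
print(summaries)
```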

***

## Table object

Structured data for a detected table. Found in `boxes[].table` when
`boxclass` is `"table"`.

<Accordion title="Example">
  ```json theme={null}
  {
    "bbox": [71.15, 72.19, 523.22, 117.68],
    "row_count": 3,
    "col_count": 4,
    "cells": [
      [[71.15, 72.19, 184.6, 87.36], [184.6, 72.19, 297.16, 87.36], ...],
      ...
    ],
    "extract": [
      ["A", "B", "C", "D"],
      ["A1", "B1", "C1", "D1"],
      ["A2", "B2", "C2", "D2"]
    ],
    "markdown": "|A|B|C|D|\n|---|---|---|---|\n|A1|B1|C1|D1|\n|A2|B2|C2|D2|\n\n"
  }
  ```
</Accordion>

<ParamField body="bbox" type="number[4]">
  Bounding box of the entire table as `[x0, y0, x1, y1]` in PDF points.
</ParamField>

<ParamField body="row_count" type="number">
  Number of rows in the table, including any header row.
</ParamField>

<ParamField body="col_count" type="number">
  Number of columns in the table.
</ParamField>

<ParamField body="cells" type="array">
  A 3D array of cell bounding boxes: `cells[row][col]` gives `[x0, y0, x1, y1]`
  for that cell in PDF points. Useful for mapping extracted text back to exact
  cell positions on the page.
</ParamField>
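As an illustration of mapping a page position back to a cell, the sketch below looks up which cell contains a given point. `find_cell` is a hypothetical helper, and `sample_cells` holds only the two cell boxes that are spelled out in the example above.

```python
def find_cell(cells, x, y):
    """Return (row, col) of the cell whose bbox contains (x, y), else None."""
    for row_index, row in enumerate(cells):
        for col_index, (x0, y0, x1, y1) in enumerate(row):
            # Half-open intervals so a shared edge belongs to one cell only
            if x0 <= x < x1 and y0 <= y < y1:
                return row_index, col_index
    return None

# One row with the two cell boxes shown in the example above.
sample_cells = [
    [[71.15, 72.19, 184.6, 87.36], [184.6, 72.19, 297.16, 87.36]],
]

print(find_cell(sample_cells, 200.0, 80.0))
print(find_cell(sample_cells, 10.0, 10.0))
```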

<ParamField body="extract" type="array">
  A 2D array of the cell text values: `extract[row][col]` gives the string
  content of that cell. The first row is typically the header row.
</ParamField>
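When the first row really is a header, the 2D array converts naturally into a list of records. This is a small sketch under that assumption; `table_to_records` is a hypothetical helper, and a table without a header row would need different handling.

```python
def table_to_records(table):
    """Turn `extract` into a list of dicts keyed by the first (header) row."""
    rows = table["extract"]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]

# The `extract` array from the example above.
sample_table = {
    "extract": [
        ["A", "B", "C", "D"],
        ["A1", "B1", "C1", "D1"],
        ["A2", "B2", "C2", "D2"],
    ]
}

records = table_to_records(sample_table)
print(records[0])
```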

<ParamField body="markdown" type="string">
  The table rendered as a Markdown pipe table string, ready for display or
  further processing.
</ParamField>

***

## Textline object

A single line of text within a box. Found in `boxes[].textlines[]`.

<Accordion title="Example">
  ```json theme={null}
  {
    "bbox": [70.69, 295.88, 197.28, 304.63],
    "spans": [...]
  }
  ```
</Accordion>

<ParamField body="bbox" type="number[4]">
  Bounding box of this text line as `[x0, y0, x1, y1]` in PDF points.
</ParamField>

<ParamField body="spans" type="array">
  Array of [span objects](#span-object). A line is split into multiple spans
  wherever the font, size, or style changes; a uniformly styled line has a
  single span.
</ParamField>
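To rebuild the plain text of a line, the span texts can be concatenated. The sketch below additionally inserts a space when two neighbouring spans are separated by a visible horizontal gap. That gap rule is a heuristic assumed here for illustration, not something the schema prescribes; it only makes sense for left-to-right horizontal text, and `line_text` and `gap_ratio` are hypothetical names.

```python
def line_text(textline, gap_ratio=0.15):
    """Join span texts, adding a space where spans are visibly separated."""
    parts = []
    previous = None
    for span in textline["spans"]:
        if previous is not None:
            # Horizontal distance between the end of one span and the next
            gap = span["bbox"][0] - previous["bbox"][2]
            spaced = previous["text"].endswith(" ") or span["text"].startswith(" ")
            if gap > gap_ratio * span["size"] and not spaced:
                parts.append(" ")
        parts.append(span["text"])
        previous = span
    return "".join(parts)

# Only the span fields this sketch needs, taken from the span examples on this page.
sample_line = {
    "bbox": [70.69, 295.88, 197.28, 304.63],
    "spans": [
        {"text": "Hello World!", "size": 12, "bbox": [70.69, 295.88, 136.09, 304.61]},
        {"text": "This is bold", "size": 12, "bbox": [138.83, 296.03, 197.28, 304.63]},
    ],
}

print(line_text(sample_line))
```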

***

## Span object

The smallest unit of text, sharing a single consistent style. Found in
`textlines[].spans[]` and `fulltext[].lines[].spans[]`.

A span break occurs at any change of font, size, weight, colour, or style — so
a line reading "Hello World! **This is bold**" would produce
two separate spans. See [Font Flags Reference](/python/guides/extract-JSON#font-flags-reference) for how to interpret the `flags` field.

<Accordion title="Example — regular text">
  ```json theme={null}
  {
    "size": 12,
    "flags": 0,
    "bidi": 0,
    "char_flags": 16,
    "font": "Arial",
    "color": 0,
    "alpha": 255,
    "ascender": 0.8,
    "descender": -0.2,
    "text": "Hello World!",
    "origin": [70.69, 304.47],
    "bbox": [70.69, 295.88, 136.09, 304.61],
    "line": 0,
    "block": 0,
    "dir": [1, 0]
  }
  ```
</Accordion>

<Accordion title="Example — bold text">
  ```json theme={null}
  {
    "size": 12,
    "flags": 16,
    "bidi": 0,
    "char_flags": 24,
    "font": "MinionPro-Bold",
    "color": 0,
    "alpha": 255,
    "ascender": 0.8,
    "descender": -0.2,
    "text": "This is bold",
    "origin": [138.83, 304.47],
    "bbox": [138.83, 296.03, 197.28, 304.63],
    "line": 0,
    "block": 0,
    "dir": [1, 0]
  }
  ```
</Accordion>

<ParamField body="text" type="string">
  The actual text content of this span.
</ParamField>

<ParamField body="font" type="string">
  Full PostScript font name, e.g. `"Arial"`, `"MinionPro-Bold"`, `"Aptos"`.
  The font name often encodes weight and style (e.g. `-Bold`, `-It`).
</ParamField>

<ParamField body="size" type="number">
  Font size in points.
</ParamField>

<ParamField body="flags" type="number">
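  Bitmask of font style flags as reported by PyMuPDF (these are not the
  font-descriptor flags of the PDF specification). Common values:

  * `0`: regular
  * `2`: italic (bit 1)
  * `16`: bold (bit 4)
  * `18`: bold + italic (bits 1 and 4)

  Bit 0 (`1`) marks superscript, bit 2 (`4`) a serifed font, and bit 3 (`8`) a
  monospaced font, so values seen in practice often combine several bits.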
</ParamField>
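Because `flags` is a bitmask, individual styles should be tested with a bitwise AND instead of an equality check. The sketch below tests only the bold bit (`16`) and falls back to the font name, which often encodes the weight. `is_bold` is a hypothetical helper, and the font-name fallback is a heuristic that can misfire on unusual font names.

```python
BOLD_BIT = 16  # bit 4 of `flags`

def is_bold(span):
    """True if the bold bit is set or the font name suggests a bold face."""
    if span["flags"] & BOLD_BIT:
        return True
    # Heuristic fallback: names such as "MinionPro-Bold" encode the weight
    return "bold" in span["font"].lower()

regular_span = {"flags": 0, "font": "Arial", "text": "Hello World!"}
bold_span = {"flags": 16, "font": "MinionPro-Bold", "text": "This is bold"}

print(is_bold(regular_span), is_bold(bold_span))
```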

<ParamField body="char_flags" type="number">
  Additional character-level flags. See [this enumeration](https://github.com/ArtifexSoftware/mupdf/blob/66ef5879c18bc7cc0831fd9b915b257ab717b79e/include/mupdf/fitz/structured-text.h#L489) for details.
</ParamField>

<ParamField body="color" type="number">
  Text colour as a packed sRGB integer in the form `0xRRGGBB`. `0` is black (`#000000`).
</ParamField>
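A packed colour can be split into channels with shifts and masks. This sketch assumes the usual `0xRRGGBB` byte order (red in the highest byte); `unpack_rgb` and `to_hex` are hypothetical helper names.

```python
def unpack_rgb(color):
    """Split a packed 0xRRGGBB integer into an (r, g, b) tuple of 0-255 values."""
    red = (color >> 16) & 0xFF
    green = (color >> 8) & 0xFF
    blue = color & 0xFF
    return red, green, blue

def to_hex(color):
    """Format a packed colour as a CSS-style hex string."""
    return "#{:06x}".format(color & 0xFFFFFF)

print(unpack_rgb(0), to_hex(0))
print(unpack_rgb(0xFF8000), to_hex(0xFF8000))
```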

<ParamField body="alpha" type="number">
  Opacity of the text, from `0` (transparent) to `255` (fully opaque).
</ParamField>

<ParamField body="ascender" type="number">
  Font ascender as a fraction of the font size. Typically `0.8`, meaning the
  ascender reaches 80% of the em above the baseline.
</ParamField>

<ParamField body="descender" type="number">
  Font descender as a fraction of the font size. Typically `-0.2`, meaning the
  descender extends 20% of the em below the baseline.
</ParamField>

<ParamField body="bbox" type="number[4]">
  Tight bounding box of the rendered glyphs as `[x0, y0, x1, y1]` in PDF points.
</ParamField>

<ParamField body="origin" type="number[2]">
  The text origin point `[x, y]` — the position of the baseline at the start
  of the span, in PDF points.
</ParamField>

<ParamField body="bidi" type="number">
  Unicode bidirectional level. `0` for left-to-right text.
</ParamField>

<ParamField body="line" type="number">
  Index of the line this span belongs to within its parent block.
</ParamField>

<ParamField body="block" type="number">
  Index of the block this span belongs to within the page's content stream.
</ParamField>

<ParamField body="dir" type="number[2]">
  Text direction as a unit vector `[x, y]`. `[1, 0]` is standard
  left-to-right horizontal text. Because y increases down the page, `[0, -1]`
  indicates vertical text running bottom-to-top and `[0, 1]` text running
  top-to-bottom.
</ParamField>
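When filtering out rotated or vertical text, it is often enough to classify the direction vector coarsely. The following is a minimal sketch; `orientation` and its tolerance `tol` are hypothetical, and the function deliberately does not distinguish the two possible directions along each axis.

```python
def orientation(direction, tol=1e-6):
    """Classify a `dir` unit vector as horizontal, vertical, or rotated."""
    dx, dy = direction
    if abs(dy) <= tol:
        return "horizontal"
    if abs(dx) <= tol:
        return "vertical"
    return "rotated"

print(orientation([1, 0]))
print(orientation([0, -1]))
print(orientation([0.7071, 0.7071]))
```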

***

## Fulltext block

A raw text block from the PDF content stream, independent of visual layout.
Found in `pages[].fulltext[]`.

The `fulltext` array captures text in the order it appears in the PDF's
internal stream, which may differ from the visual reading order. Each block
contains one or more lines, and each line contains spans.

<Accordion title="Example">
  ```json theme={null}
  {
    "type": 0,
    "number": 0,
    "flags": 0,
    "bbox": [70.69, 295.88, 197.28, 304.63],
    "lines": [
      {
        "spans": [...],
        "wmode": 0,
        "dir": [1, 0],
        "bbox": [70.69, 295.88, 197.28, 304.63]
      }
    ]
  }
  ```
</Accordion>

<ParamField body="type" type="number">
  Block type as reported by PyMuPDF. `0` indicates a text block.
</ParamField>

<ParamField body="number" type="number">
  Sequential index of this block within the page's content stream.
</ParamField>

<ParamField body="flags" type="number">
  Block-level flags. `0` for standard text blocks.
</ParamField>

<ParamField body="bbox" type="number[4]">
  Bounding box of the entire block as `[x0, y0, x1, y1]` in PDF points.
</ParamField>

<ParamField body="lines" type="array">
  Array of line objects within this block. Each line has:

  * `spans` — array of [span objects](#span-object)
  * `wmode` — writing mode (`0` = horizontal, `1` = vertical)
  * `dir` — line direction vector, e.g. `[1, 0]` for left-to-right
  * `bbox` — bounding box of the line as `[x0, y0, x1, y1]`
</ParamField>
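The block, line, and span nesting can be flattened into plain text in stream order. The sketch below joins spans within a line, lines with a newline, and blocks with a blank line. `fulltext_to_text` is a hypothetical helper, and the span texts in `sample_fulltext` are made-up sample data reduced to the `text` field.

```python
def fulltext_to_text(blocks):
    """Flatten fulltext blocks into a string, keeping content-stream order."""
    chunks = []
    for block in blocks:
        if block.get("type") != 0:  # keep text blocks only
            continue
        lines = [
            "".join(span["text"] for span in line["spans"])
            for line in block["lines"]
        ]
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks)

# Made-up sample data following the structure documented above.
sample_fulltext = [
    {"type": 0, "number": 0, "flags": 0, "bbox": [70.69, 295.88, 197.28, 304.63],
     "lines": [{"spans": [{"text": "Hello World! "}, {"text": "This is bold"}],
                "wmode": 0, "dir": [1, 0],
                "bbox": [70.69, 295.88, 197.28, 304.63]}]},
    {"type": 0, "number": 1, "flags": 0, "bbox": [70.69, 320.0, 150.0, 330.0],
     "lines": [{"spans": [{"text": "Second block"}],
                "wmode": 0, "dir": [1, 0],
                "bbox": [70.69, 320.0, 150.0, 330.0]}]},
]

print(fulltext_to_text(sample_fulltext))
```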

***

## Metadata object

PDF document-level metadata. Found at the root as `metadata`.

<Accordion title="Example">
  ```json theme={null}
  {
    "format": "PDF 1.6",
    "title": "",
    "author": "",
    "subject": "",
    "keywords": "",
    "creator": "",
    "producer": "",
    "creationDate": "D:20240722172345Z",
    "modDate": "D:20260318153118Z",
    "trapped": "",
    "encryption": null
  }
  ```
</Accordion>

<ParamField body="format" type="string">
  PDF version string, e.g. `"PDF 1.4"` or `"PDF 1.6"`.
</ParamField>

<ParamField body="title" type="string">
  Document title as set in the PDF's document properties. Empty string if not set.
</ParamField>

<ParamField body="author" type="string">
  Document author as set in the PDF's document properties. Empty string if not set.
</ParamField>

<ParamField body="subject" type="string">
  Document subject. Empty string if not set.
</ParamField>

<ParamField body="keywords" type="string">
  Keywords associated with the document. Empty string if not set.
</ParamField>

<ParamField body="creator" type="string">
  The application that originally created the document (before any PDF
  conversion), e.g. `"Microsoft Word"`. Empty string if not set.
</ParamField>

<ParamField body="producer" type="string">
  The application that produced or last saved the PDF file, e.g.
  `"macOS Quartz PDFContext"`. Empty string if not set.
</ParamField>

<ParamField body="creationDate" type="string">
  Creation timestamp in PDF date format: `D:YYYYMMDDHHmmSSOHH'mm'`, where `O`
  is the relation of local time to UTC (`+`, `-`, or `Z`) and `HH'mm'` is the
  offset. Example: `"D:20240722172345Z"` = 22 July 2024, 17:23:45 UTC.
</ParamField>

<ParamField body="modDate" type="string">
  Last modification timestamp in the same PDF date format.
</ParamField>
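Both timestamps can be converted to `datetime` objects with a small parser. This is a sketch using only the standard library, not a function provided by the package: `parse_pdf_date` is a hypothetical name, missing trailing components default to the earliest value, and strings that do not match the pattern yield `None`.

```python
import re
from datetime import datetime, timedelta, timezone

PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"  # date and time parts
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"                 # optional UTC offset
)

def parse_pdf_date(value):
    """Parse a PDF date string into a datetime, or return None if malformed."""
    match = PDF_DATE.match(value.strip())
    if match is None:
        return None
    defaults = (1, 1, 1, 0, 0, 0)  # year is always present, the rest optional
    parts = [int(g) if g else d for g, d in zip(match.groups()[:6], defaults)]
    zulu, sign, off_hours, off_minutes = match.groups()[6:]
    tz = None  # stays naive when the string carries no offset
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_hours), minutes=int(off_minutes or 0))
        tz = timezone(offset if sign == "+" else -offset)
    return datetime(*parts, tzinfo=tz)

created = parse_pdf_date("D:20240722172345Z")
print(created.isoformat())
```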

<ParamField body="trapped" type="string">
  PDF trapping status. Rarely set in practice; empty string if not applicable.
</ParamField>

<ParamField body="encryption" type="string | null">
  Encryption details if the PDF is encrypted. `null` for unencrypted documents.
</ParamField>

## See Also

<CardGroup cols={2}>
  <Card title="Chunk Schema" icon="layer-group" href="/python/reference/chunk-schema">
    Schema for `page_chunks=True` output from `to_markdown()`.
  </Card>

  <Card title="Extract JSON Guide" icon="brackets-curly" href="/python/guides/extract-JSON">
    Working walkthrough with filtering and pipeline examples.
  </Card>

  <Card title="to_json()" icon="code" href="/python/api/to_json">
    Full API reference for `to_json()`.
  </Card>

  <Card title="Tables Guide" icon="table" href="/python/guides/tables">
    Extracting and working with table blocks.
  </Card>
</CardGroup>
