{
  "type": "Article",
  "authors": [
    {
      "type": "Person",
      "familyNames": [
        "Schubotz"
      ],
      "givenNames": [
        "Moritz"
      ]
    },
    {
      "type": "Person",
      "familyNames": [
        "Teschke"
      ],
      "givenNames": [
        "Olaf"
      ]
    }
  ],
  "description": "In this article, we give motivation for the need for standardized machine\ninterfaces to zbMATH open data, outline the target audience, describe our\npreliminary strategy to develop API interfaces, and report on the details of\nthe first interface we implemented.",
  "identifiers": [],
  "references": [
    {
      "type": "Article",
      "id": "bib-bib1",
      "authors": [],
      "title": "\nK. Hulek, F. Müller, M. Schubotz and O. Teschke,\nMathematical research data – an analysis through zbMATH references. Eur. Math.Soc. Newsl.113, 54–57 (2019)\n"
    },
    {
      "type": "Article",
      "id": "bib-bib2",
      "authors": [],
      "title": "\nK. Hulek and O. Teschke, Die Transformation von zbMATH zu einer offenen\nPlattform für die Mathematik. Mitt. Dtsch. Math.-Ver.28, 108–111 (2020)\n"
    },
    {
      "type": "Article",
      "id": "bib-bib3",
      "authors": [],
      "title": " N. Meuschke, V. Stange, M. Schubotz, M. Kramer and B. Gipp, Improving academic plagiarism detection for STEM documents by analyzing mathematical content and citations. 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), Champaign, IL, USA, 120–129 (2019) "
    },
    {
      "type": "Article",
      "id": "bib-bib4",
      "authors": [],
      "title": "\nF. Müller, M. Schubotz and O. Teschke, References to research literature in QA forums – a case study of zbMATH links from MathOverflow.\nEur. Math. Soc. Newsl.114, 50–52 (2019)\n"
    },
    {
      "type": "Article",
      "id": "bib-bib5",
      "authors": [],
      "title": " M. Schubotz, D. Trautwein and O. Teschke, zbMATH is open: A practical guide to open an informationservice. Proceedings of The Open Science Conference 2021 (OSC ’21), February 17–19, 2020, online\n"
    }
  ],
  "title": "zbMATH Open: Towards standardized machine interfaces to expose bibliographic metadata",
  "meta": {},
  "content": [
    {
      "type": "Heading",
      "id": "S1",
      "depth": 1,
      "content": [
        "1 Target audience"
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p1",
      "content": [
        "As announced in the previous note,\nzbMATH is becoming open access from the 1st of January\n2021.",
        {
          "type": "Note",
          "id": "idm18",
          "noteType": "Footnote",
          "content": [
            {
              "type": "Paragraph",
              "id": "footnote1",
              "content": [
                "The open web platform is now available under the name zbMATH\nOpen, while we will address the zbMATH content and services under the\ntraditional umbrella name zbMATH for convenience."
              ]
            }
          ]
        },
        " For most working\nmathematicians, this means that they can access zbMATH from anywhere in the\nworld without a subscription or authentication. Additionally, we envision\nbenefits to the community through our efforts to connect zbMATH data with\ninformation systems of research data, collaborative platforms, funding\nagencies, and intra-disciplinary efforts, as outlined in [",
        {
          "type": "Cite",
          "target": "bib-bib2",
          "content": [
            "2"
          ]
        },
        "]. We\nexpect that our efforts to disseminate the results of mathematical research\nwill provide this research with increased visibility. However, to target\ndomain-independent information systems, we need to comply with standardized\ninformation exchange protocols and interfaces."
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p2",
      "content": [
        "In what follows, we describe potential partners that might interact with\nzbMATH. We will offer the data via so-called Application Programming\nInterfaces (APIs). Moreover, in this report, we focus on how others can make\nuse of zbMATH open data, rather than how zbMATH can use other data sources. As\ndepicted in Figure ",
        {
          "type": "Cite",
          "target": "S1-F1",
          "content": [
            "1"
          ]
        },
        ", the potential consumers can be clustered into at\nleast five groups, which we will describe below."
      ]
    },
    {
      "type": "Figure",
      "id": "S1-F1",
      "caption": [
        {
          "type": "Paragraph",
          "content": [
            "Envisioned consumer (left) and desiderata (right) [",
            {
              "type": "Cite",
              "target": "bib-bib5",
              "content": [
                "5"
              ]
            },
            "]"
          ]
        }
      ],
      "content": [
        {
          "type": "Table",
          "rows": [
            {
              "type": "TableRow",
              "cells": [
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "Emphasis",
                      "content": [
                        "Bibliographic consumer"
                      ]
                    }
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    "MathOverflow, Wikimedia, arXiv, Zotero"
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m1\" alttext=\"a_{1}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">a</mml:mi><mml:mn mathsize=\"80%\">1</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "a_{1}"
                      }
                    },
                    " Selection of individual items,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m2\" alttext=\"a_{2}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">a</mml:mi><mml:mn mathsize=\"80%\">2</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "a_{2}"
                      }
                    },
                    " High throughput,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m3\" alttext=\"a_{3}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">a</mml:mi><mml:mn mathsize=\"80%\">3</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "a_{3}"
                      }
                    },
                    " End-user-friendly formats,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m4\" alttext=\"a_{4}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">a</mml:mi><mml:mn mathsize=\"80%\">4</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "a_{4}"
                      }
                    },
                    " Various representations,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m5\" alttext=\"a_{5}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">a</mml:mi><mml:mn mathsize=\"80%\">5</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "a_{5}"
                      }
                    },
                    " Fuzzy search"
                  ]
                }
              ]
            },
            {
              "type": "TableRow",
              "cells": [
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "Emphasis",
                      "content": [
                        "Aggregators"
                      ]
                    }
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    "OpenAIRE/ERC, NFDI/DFG"
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m6\" alttext=\"b_{1}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">b</mml:mi><mml:mn mathsize=\"80%\">1</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "b_{1}"
                      }
                    },
                    " Standard compliance, ",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m7\" alttext=\"b_{2}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">b</mml:mi><mml:mn mathsize=\"80%\">2</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "b_{2}"
                      }
                    },
                    " Incremental updates, ",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m8\" alttext=\"b_{3}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">b</mml:mi><mml:mn mathsize=\"80%\">3</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "b_{3}"
                      }
                    },
                    " Projection on properties"
                  ]
                }
              ]
            },
            {
              "type": "TableRow",
              "cells": [
                {
                  "type": "TableCell",
                  "content": [
                    "Archives"
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    "Software Heritage, Internet archive"
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m9\" alttext=\"c_{1}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">c</mml:mi><mml:mn mathsize=\"80%\">1</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "c_{1}"
                      }
                    },
                    " Fetch everything, ",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m10\" alttext=\"c_{2}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">c</mml:mi><mml:mn mathsize=\"80%\">2</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "c_{2}"
                      }
                    },
                    " Reduce traffic, ",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m11\" alttext=\"c_{3}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">c</mml:mi><mml:mn mathsize=\"80%\">3</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "c_{3}"
                      }
                    },
                    " Traceability of versions"
                  ]
                }
              ]
            },
            {
              "type": "TableRow",
              "cells": [
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "Emphasis",
                      "content": [
                        "Search engines"
                      ]
                    }
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    "Firefox search plugin"
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m12\" alttext=\"d_{1}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">d</mml:mi><mml:mn mathsize=\"80%\">1</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "d_{1}"
                      }
                    },
                    " Selection of individual items,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m13\" alttext=\"d_{2}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">d</mml:mi><mml:mn mathsize=\"80%\">2</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "d_{2}"
                      }
                    },
                    " High throughput,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m14\" alttext=\"d_{3}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">d</mml:mi><mml:mn mathsize=\"80%\">3</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "d_{3}"
                      }
                    },
                    " End-user-friendly formats,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m15\" alttext=\"d_{4}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">d</mml:mi><mml:mn mathsize=\"80%\">4</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "d_{4}"
                      }
                    },
                    " Various representations,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m16\" alttext=\"d_{5}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">d</mml:mi><mml:mn mathsize=\"80%\">5</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "d_{5}"
                      }
                    },
                    " Fuzzy search,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m17\" alttext=\"d_{6}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">d</mml:mi><mml:mn mathsize=\"80%\">6</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "d_{6}"
                      }
                    },
                    " Formula search"
                  ]
                }
              ]
            },
            {
              "type": "TableRow",
              "cells": [
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "Emphasis",
                      "content": [
                        "Individuals"
                      ]
                    }
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    "Blog on specific topic, Personal reference list"
                  ]
                },
                {
                  "type": "TableCell",
                  "content": [
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m18\" alttext=\"e_{1}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">e</mml:mi><mml:mn mathsize=\"80%\">1</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "e_{1}"
                      }
                    },
                    " Easy to setup,\n",
                    {
                      "type": "MathFragment",
                      "mathLanguage": "mathml",
                      "text": "<mml:math xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" id=\"S1.F1.m19\" alttext=\"e_{2}\" display=\"inline\"><mml:msub><mml:mi mathsize=\"80%\">e</mml:mi><mml:mn mathsize=\"80%\">2</mml:mn></mml:msub></mml:math>",
                      "meta": {
                        "altText": "e_{2}"
                      }
                    },
                    " Long term stability"
                  ]
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p3",
      "content": [
        {
          "type": "Emphasis",
          "content": [
            "Bibliographic consumers"
          ]
        },
        " are information systems that display references\nto scientific publications. They often deal with user-generated content that\nreferences individual research articles. For websites like Wikipedia or\nMathOverflow, users interactively search for references to support their\nstatements. The remote information system, e.g., MathOverflow, sends the user’s\nsearch-string to a designated zbMATH API endpoint, which then returns a ranked\nlist of possible references. The remote information system takes care of the\nformatting. While MathOverflow, for instance, might use zbMATH exclusively,\nothers, such as Wikipedia, might want to fuse results from zbMATH with results\nfrom other providers of bibliographic metadata. Standardized protocols\ndrastically reduce the implementation effort for intra-domain information\nsystems. Even before the transformation to zbMATH open, we provided a simple\nAPI for MathOverflow [",
        {
          "type": "Cite",
          "target": "bib-bib4",
          "content": [
            "4"
          ]
        },
        "], which was limited to the top three search\nresults. This legal restriction has now vanished. In contrast, to interactive\nbibliographic customers described before, arXiv and other publishers might use\nzbMATH’s bibliographic metadata to disambiguate references, which is an\nessential prerequisite for many information retrieval tasks such as\nrecommendations, semantic searches, or plagiarism detection [",
        {
          "type": "Cite",
          "target": "bib-bib3",
          "content": [
            "3"
          ]
        },
        "]."
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p4",
      "content": [
        {
          "type": "Emphasis",
          "content": [
            "Aggregators"
          ]
        },
        " such as the OpenAIRE research explorer, SemanticScholar,\nDataCite, or Altmetric extract information from different sources, transform\nthem to standardized representations and load them into their specific data\nmodels. Additionally, in some countries, researchers are also required to\nreport their publications to government platforms. At the end of the process,\nfunding agencies or other decision-makers can use these data sources for\nso-called data-driven decision making. Here standardized interfaces and formats\nevolved to simplify the aggregators’ job, as crawling through web-pages\noptimized for human consumption is error-prone and involves complicated\nheuristics that are fragile and vulnerable to layout changes.\n"
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p5",
      "content": [
        {
          "type": "Emphasis",
          "content": [
            "Archives"
          ]
        },
        " such as the Internet Archive and Software Heritage capture the\ndigital history in the forms of websites or software code. Since their mission\nis digital preservation, an API that enables replaying the entire history of\nthe website would be ideal. Moreover, they strive to reduce traffic\nconsumption and avoid redundancy. Their infrastructure is optimized to\npreserve HTML websites in the form presented to a user at a particular point in\ntime."
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p6",
      "content": [
        {
          "type": "Emphasis",
          "content": [
            "Search Engines"
          ]
        },
        " might use our API to present search results in a\ndifferent format. For instance, Mozilla Firefox has a built-in feature to\ninclude custom search engines that implement the OpenSearch standard. One\ninteresting feature to consider is if and how mathematical formulæ are\nrepresented in OpenSearch."
      ]
    },
    {
      "type": "Paragraph",
      "id": "S1.p7",
      "content": [
        {
          "type": "Emphasis",
          "content": [
            "Individuals"
          ]
        },
        " or small groups of people with particular needs are of\nparticular importance to us. We aim to provide as much support as possible to\nresearch groups, either from mathematics or from the field of bibliometric\nresearch. Highly motivated individuals who aim to use our data creatively are\nalso on our schedule. Here, we are open to requests, and need to investigate\npotential uses case-by-case. A typical, not too exceptional use case we\nenvision would be to set up a personal publication list or to enrich a personal\nwebsite with the latest news of specific Mathematics Subject Classification\n(MSC) classes. While many of these functionalities are already possible with\nzbMATH’s news-feed functionality, we expect the API functionality to be more\nflexible."
      ]
    },
    {
      "type": "Heading",
      "id": "S2",
      "depth": 1,
      "content": [
        "2 First steps towards APIs"
      ]
    },
    {
      "type": "Paragraph",
      "id": "S2.p1",
      "content": [
        "Given the diverse expectations and needs described above, we do not see a\none-size-fits-all solution that would fulfill the diverse requirements.\nTherefore, we decided to pursue an iterative approach to building API\nsolutions. As a first step, we aim to start with a first API version that is\nwell-established, easy to implement, and has a substantial positive impact on\nworking mathematicians. According to our analysis, aggregators, archives, and\nbibliometric researchers commonly use the Open Archives Initiative Protocol for\nMetadata Harvesting (OAI-PMH). OAI-PMH seems to be well-established,\nsufficiently documented, and relatively easy to implement. This protocol is\nalso well-suited to downloading, i.e. to harvesting the entire open collection\nof zbMATH document data. These data come along with a CC-BY-SA license, which\nfacilitates both reusability and allowing derived work to remain in the open\necosystem, although this comes at the cost that some third-party content (such\nas abstracts) is not included due to legal constraints. Additionally, one may\nharvest well-defined subsets and consume updates since the last download\nwithout requiring to redownload a dump. We expect that this format will also\nbe well-suited to individuals working with zbMATH data. Especially consumers\nthat work with other datasets besides zbMATH will appreciate the standardized\nfunctionality of the protocol. However, the format is not well-suited for\nbibliographic consumers, given the overhead caused by the standard and the lack\nof flexibility. Because of this, we decided to create at least two APIs, with\nthe OAI-PMH API as a starting point, and other more flexible APIs to be\ndetermined."
      ]
    },
    {
      "type": "Figure",
      "id": "S2-F2",
      "caption": [
        {
          "type": "Paragraph",
          "content": [
            "A conceptual overview of the zbMATH database and the data flow from an\nto the database"
          ]
        }
      ],
      "content": [
        {
          "type": "ImageObject",
          "contentUrl": "0-schubotz-1.svg",
          "mediaType": "image/svg+xml",
          "meta": {
            "inline": false
          }
        }
      ]
    },
    {
      "type": "Paragraph",
      "id": "S2.p2",
      "content": [
        "In Figure ",
        {
          "type": "Cite",
          "target": "S2-F2",
          "content": [
            "2"
          ]
        },
        ", we display a possible scenario for zbMATH’s\nfuture API development efforts. The blue boxes (Reviewer Interface, Internal\nInterfaces, zbMATH database, and zbMATH Website) show the well-established\ncomponents of zbMATH. The dark gray box (OAI-PMH API) shows the newly released\nAPI described in this paper. With this setup, all write operations to the\ndatabase will be performed from the reviewer interface and other internal\ninterfaces. In contrast, the Website and the OAI-PMH interfaces are read-only\ninterfaces that present the zbMATH database’s contents without modifying it."
      ]
    },
    {
      "type": "Paragraph",
      "id": "S2.p3",
      "content": [
        "As a next step, we are working on an API to create links from zbMATH to\nexternal sources and vice versa. One commonly accepted format to describe these\nconnections is the Scholix format. Therefore we labeled this project “Scholix\nLink API” in Figure ",
        {
          "type": "Cite",
          "target": "S2-F2",
          "content": [
            "2"
          ]
        },
        ". Independently of this task, we are\nalso working on a general-purpose API (Gray box Custom API in the figure) that\nreplicates the current website’s functionality but produces the results in a\nbetter machine-processible form, such as JSON instead of HTML. In theory, one\nmight use this API for a far-future version of the zbMATH website, given that\nefficient caching layers are implemented. While we are pursuing the linking\nand custom API efforts in parallel, our goal is to limit as far as possible the\nnumber of distinct API endpoints. Another vital link will be a bidirectional\nlink to research data in the context of the German Mathematical Research Data\nInitiative (MaRDI), a consortium formed for applications within the National\nResearch Data Infrastructure (see [",
        {
          "type": "Cite",
          "target": "bib-bib1",
          "content": [
            "1"
          ]
        },
        "]). In the MaRDI project (light\ngray box at the top), we plan to repurpose the generic WikiBase knowledge graph\nsoftware that supports many well-established structured graph data exchange\nprotocols, such as RDF and SPARQL among others."
      ]
    },
    {
      "type": "Heading",
      "id": "S3",
      "depth": 1,
      "content": [
        "3 Implementation"
      ]
    },
    {
      "type": "Paragraph",
      "id": "S3.p1",
      "content": [
        "Given the above motivations, we have implemented a first version of the OAI-PMH\ninterface. The current demo is available from\n",
        {
          "type": "Link",
          "target": "https://purl.org/zb/10",
          "content": [
            "purl.org/zb/10"
          ]
        },
        ". As required by the\nprotocol, our OAI-PMH api offers six endpoints, namely (1 Identify,\n2 ListMetadataFormats, 3 ListSets, 4 ListIdentifiers, 5 ListRecords,\n6 GetRecord):"
      ]
    },
    {
      "type": "List",
      "items": [
        {
          "type": "ListItem",
          "content": [
            {
              "type": "Paragraph",
              "id": "S3.I1.i1.p1",
              "content": [
                "Endpoint 1 helps aggregators and archives to discover the new API fully\nunsupervised, identify which version of the OAI-PMH standards we are using,\nand other technicalities."
              ]
            }
          ]
        },
        {
          "type": "ListItem",
          "content": [
            {
              "type": "Paragraph",
              "id": "S3.I1.i2.p1",
              "content": [
                "Endpoint 2 lists the formats that we use to expose zbMATH data. We\nimplemented two flavors. The first is the required standard Dublin Core\nMetadata Record format, which contains standardized fields like abstract,\npublisher, creator, or title. For the second, we implemented a format that\nis closer to zbMATH’s internal data model. Many domain-specific\nclassifications can be expressed in terms of Dublin core vocabulary.\nHowever, the MSC is not predefined in the Dublin core standard, even though\nit seems to us that it could be modeled. However, expressing all the details\nof zbMATH’s data in Dublin core terms would require an immense effort of\ncoordination with librarians to ensure that our encodings are modeled\naccording to common best practices for modelling specifics in Dublin core.\nIn other words, we are addressing the issue from two ends. With the DC\nstandard, we encode the low hanging fruits in a standard way. With our\nadditional zbMATH custom format, we ensure that we expose all the data we are\nlegally allowed to by the API."
              ]
            }
          ]
        },
        {
          "type": "ListItem",
          "content": [
            {
              "type": "Paragraph",
              "id": "S3.I1.i3.p1",
              "content": [
                "Endpoint 3 lists the subsets of the zbMATH dataset that we think could be\nrelevant. In the first version, we provide the following sets: document type,\nyear, document author, classification, keyword, document language, author\nvariation, author reference, biographic reference, software, review type,\nreview language, reviewer, serial publisher. For example, the set\n",
                "document_author:Noether, Emmy",
                " is the subset of all zbMATH entries\nauthored by Emmy Noether. As an extension to OAI-PMH’s built-in set logic,\nwe implemented magic characters ",
                "|&~",
                " that indicate the standard set\noperations ‘or’, ‘and’, and ‘not’ respectively, allowing users to combine\nsets at will. Obviously, in endpoint 3, we only enumerate the 1 125 144\nbase sets."
              ]
            }
          ]
        },
        {
          "type": "ListItem",
          "content": [
            {
              "type": "Paragraph",
              "id": "S3.I1.i4.p1",
              "content": [
                "Endpoints 4 and 5 list the currently 4 206 870 zbMATH identifiers\nand records, respectively. These endpoints are well suited to obtaining a dump of\nall public zbMATH open data. Here OAI-PMH’s built-in cursor and resumption\nmechanism ensures an efficient and seamless retrieval of the data. For\nconvenience, end-users can use one of the many available OAI-PMH\nmetadata harvesters from\n",
                {
                  "type": "Link",
                  "target": "https://www.openarchives.org/pmh/tools/",
                  "content": [
                    "www.openarchives.org/pmh/tools/"
                  ]
                },
                "\nto retrieve all the data."
              ]
            }
          ]
        },
        {
          "type": "ListItem",
          "content": [
            {
              "type": "Paragraph",
              "id": "S3.I1.i5.p1",
              "content": [
                "Endpoint 6 gets individual zbMATH entries."
              ]
            }
          ]
        }
      ],
      "order": "Unordered"
    },
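    {
      "type": "Paragraph",
      "content": [
        "The harvesting loop behind endpoints 4 and 5 can be sketched in a few lines of\nPython. This is a minimal sketch under stated assumptions, not part of the\nzbMATH API: the API root https://oai.zbmath.org/v1/ is assumed, and the helper\nnames build_url, next_token, and harvest are hypothetical. It relies only on the\nOAI-PMH rule that a follow-up request carries the verb and the resumption token\nand no other arguments."
      ]
    },
    {
      "type": "CodeBlock",
      "programmingLanguage": "python",
      "text": "import urllib.parse\nimport urllib.request\nimport xml.etree.ElementTree as ET\n\n# Assumption: root of the zbMATH OAI-PMH API; adjust if it differs.\nBASE_URL = 'https://oai.zbmath.org/v1/'\n# XML namespace defined by the OAI-PMH 2.0 standard.\nOAI_NS = '{http://www.openarchives.org/OAI/2.0/}'\n\n\ndef build_url(verb, token=None, **params):\n    # Hypothetical helper: a resumed request carries only verb and token.\n    query = {'verb': verb}\n    if token:\n        query['resumptionToken'] = token\n    else:\n        query.update(params)\n    return BASE_URL + '?' + urllib.parse.urlencode(query)\n\n\ndef next_token(xml_data):\n    # Hypothetical helper: token of the next page, or None on the last page.\n    node = ET.fromstring(xml_data).find('.//' + OAI_NS + 'resumptionToken')\n    if node is None or not (node.text or '').strip():\n        return None\n    return node.text.strip()\n\n\ndef harvest(verb, **params):\n    # Yield raw XML pages until the server stops sending a token.\n    token = None\n    while True:\n        with urllib.request.urlopen(build_url(verb, token, **params)) as resp:\n            page = resp.read()\n        yield page\n        token = next_token(page)\n        if token is None:\n            break\n\n\nif __name__ == '__main__':\n    # oai_dc is the Dublin Core prefix that OAI-PMH requires;\n    # the set name is the example given for endpoint 3.\n    pages = harvest('ListIdentifiers', metadataPrefix='oai_dc',\n                    set='document_author:Noether, Emmy')\n    print(len(next(pages)), 'bytes in the first page')\n"
    },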
    {
      "type": "Paragraph",
      "content": [
        "To conclude with a real example, assume that we want to retrieve the OAI-PMH\nmetadata for the entry with zblnumber\n",
        {
          "type": "Link",
          "target": "https://zbmath.org/?q=an:1200.35057",
          "content": [
            "1200.35057"
          ]
        },
        ". We would first need to\nretrieve the corresponding internal identifier (DE number), which can be done\nby clicking on the BibTeX button below the article. In this example, a BibTeX entry with key zbMATH05797851 will be downloaded. The digits\nfollowing the word zbMATH, without the leading zero, i.e., 5797851, are the DE number. One can then use\nthis number to retrieve the metadata from the API by appending the query"
      ]
    },
    {
      "type": "QuoteBlock",
      "content": [
        {
          "type": "Paragraph",
          "content": [
            "verb=GetRecord&identifier=oai:zbmath.org:",
            "5797851&metadataPrefix=oai_zb_preview"
          ]
        }
      ]
    },
    {
      "type": "Paragraph",
      "content": [
        "to the root of the API endpoint in the browser. Here “verb” identifies the\nendpoint (6=GetRecord), and the DE number is preceded by the identifier prefix\noai:zbmath.org: and followed by the desired metadata format. The browser will then display\na large XML file that contains the review text and other public information\navailable on the zbMATH website. See\n",
        {
          "type": "Link",
          "target": "https://purl.org/zb/11",
          "content": [
            "purl.org/zb/11"
          ]
        },
        " for comparison to the\nwebsite view. One can use this method to obtain any document from the zbMATH\nopen database without downloading large sets of articles."
      ]
    },
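    {
      "type": "Paragraph",
      "content": [
        "The same lookup can be scripted. The following Python sketch derives the DE\nnumber from a BibTeX key and assembles the GetRecord query shown above. It is\nan illustration only: the API root https://oai.zbmath.org/v1/ is an assumption,\nand the function names de_number and get_record_url are hypothetical."
      ]
    },
    {
      "type": "CodeBlock",
      "programmingLanguage": "python",
      "text": "import urllib.parse\n\n# Assumption: root of the zbMATH OAI-PMH API; adjust if it differs.\nBASE_URL = 'https://oai.zbmath.org/v1/'\n\n\ndef de_number(bibtex_key):\n    # Hypothetical helper: strip the zbMATH prefix and any leading zeros,\n    # e.g. the key zbMATH05797851 gives the DE number 5797851.\n    prefix = 'zbMATH'\n    if not bibtex_key.startswith(prefix):\n        raise ValueError('not a zbMATH BibTeX key: ' + bibtex_key)\n    return str(int(bibtex_key[len(prefix):]))\n\n\ndef get_record_url(bibtex_key, metadata_prefix='oai_zb_preview'):\n    # Build the GetRecord request for a single zbMATH entry.\n    query = {\n        'verb': 'GetRecord',\n        'identifier': 'oai:zbmath.org:' + de_number(bibtex_key),\n        'metadataPrefix': metadata_prefix,\n    }\n    # Keep the colons of the OAI identifier readable in the URL.\n    return BASE_URL + '?' + urllib.parse.urlencode(query, safe=':')\n\n\nif __name__ == '__main__':\n    print(get_record_url('zbMATH05797851'))\n"
    },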
    {
      "type": "Heading",
      "id": "S4",
      "depth": 1,
      "content": [
        "4 Conclusion"
      ]
    },
    {
      "type": "Paragraph",
      "id": "S4.p1",
      "content": [
        "We have introduced the target audience of our API, discussed our strategy of\nrolling out APIs to cover a wide range of potential users, and described\ndetails of our API infrastructure’s first pillar. While our plans for future\nendpoints are subject to change and the current OAI-PMH endpoint is subject to\ncontinual improvement, we have taken the first step towards standardized\nmachine interfaces to make the data of zbMATH available to a broader audience."
      ]
    },
    {
      "type": "Paragraph",
      "id": "authorinfo",
      "content": [
        "Moritz Schubotz is a senior researcher for mathematical\ninformation retrieval and open science. He maintains the support for\nmathematical formulæ in Wikipedia and is an off-site collaborator at NIST.\n\n",
        {
          "type": "Link",
          "target": "mailto:moritz.schubotz@fiz-karlsruhe.de",
          "content": [
            "moritz.schubotz@fiz-karlsruhe.de"
          ]
        },
        "\nOlaf Teschke is Managing editor of zbMATH and Vice-chair of the EMS Committee\non publications and electronic dissemination.\n\n",
        {
          "type": "Link",
          "target": "mailto:olaf.teschke@fiz-karlsruhe.de",
          "content": [
            "olaf.teschke@fiz-karlsruhe.de"
          ]
        }
      ]
    }
  ]
}