...
Your extractor will contain several files. Those used by the Simple Extractor Wrapper are listed below, and the instructions that follow will help you create them:
- my_python_program.py (required): The Python file that contains the main function. For simplicity, these instructions call this file my_python_program.py, its main function my_main_function, and your extractor my_extractor.
- extractor_info.json (required): Contains metadata about the extractor.
- Dockerfile (required): Contains instructions to create a Docker image of your extractor.
- requirements.txt (optional): Contains the names of Python packages that will be installed using the pip command.
- packages.apt (optional): Contains the names of Linux packages that will be installed using the apt-get command.
- Create and save extractor_info.json in your source directory using any text editor. This file contains the metadata about the extractor that you are creating, and it follows the JSON-LD standard. Fill in the relevant details, such as the name, version, author, contributors, source code repository, Docker image name, the data types on which the extractor works, external services used, dependent libraries, and BibTeX-format citations for the publications that the extractor refers to. A template extractor_info.json is provided below for reference:
{
  "@context": "<context root URL>",
  "name": "<extractor name>",
  "version": "<version number>",
  "description": "<extractor description>",
  "author": "<first name> <last name> <<email address>>",
  "contributors": [
    "<first name> <last name> <<email address>>",
    "<first name> <last name> <<email address>>"
  ],
  "contexts": [
    {
      "<metadata term 1>": "<URL definition of metadata term 1>",
      "<metadata term 2>": "<URL definition of metadata term 2>"
    }
  ],
  "repository": [
    {
      "repType": "git",
      "repUrl": "<source code URL>"
    }
  ],
  "process": {
    "file": [
      "<MIME type/subtype>",
      "<MIME type/subtype>"
    ]
  },
  "external_services": [],
  "dependencies": [],
  "bibtex": []
}
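Hand-edited JSON breaks easily: a trailing comma or a curly quote pasted from a word processor is enough to make the file unparseable. The sketch below is one possible sanity check to run before building the image. It is not part of the Simple Extractor Wrapper. The function name check_extractor_info, the choice to check every top-level key of the template, and all example values (name, email address, URLs) are assumptions made for illustration.

```python
import json

# Top-level keys that appear in the template above. Treating all of them
# as expected is an assumption of this sketch, not a documented rule.
TEMPLATE_KEYS = [
    "@context", "name", "version", "description", "author", "contributors",
    "contexts", "repository", "process", "external_services",
    "dependencies", "bibtex",
]


def check_extractor_info(text):
    """Return a list of human-readable problems found in the JSON text."""
    try:
        info = json.loads(text)
    except json.JSONDecodeError as err:
        # Typical causes: trailing commas, curly quotes, unescaped characters.
        return [f"not valid JSON (line {err.lineno}, column {err.colno}): {err.msg}"]
    if not isinstance(info, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing key: {key}" for key in TEMPLATE_KEYS if key not in info]

    # "process.file" should be a list of "type/subtype" MIME strings.
    process = info.get("process")
    mime_types = process.get("file") if isinstance(process, dict) else None
    if not isinstance(mime_types, list):
        problems.append('"process.file" should be a list of MIME types')
    else:
        for mime in mime_types:
            if not isinstance(mime, str) or mime.count("/") != 1:
                problems.append(f"not a type/subtype MIME string: {mime!r}")
    return problems


# Hypothetical filled-in values, for illustration only.
example = {
    "@context": "http://example.org/contexts/metadata.jsonld",
    "name": "my_extractor",
    "version": "1.0",
    "description": "Counts lines and words in text files.",
    "author": "Jane Doe <jane.doe@example.org>",
    "contributors": [],
    "contexts": [
        {"lines": "http://example.org/terms#lines"}
    ],
    "repository": [
        {"repType": "git", "repUrl": "https://example.org/my_extractor.git"}
    ],
    "process": {"file": ["text/plain"]},
    "external_services": [],
    "dependencies": [],
    "bibtex": [],
}

example_text = json.dumps(example, indent=2)
print(check_extractor_info(example_text))  # prints []
```

To check a real file, read its contents and pass them in, for example `check_extractor_info(open("extractor_info.json", encoding="utf-8").read())`. An empty list means the file parsed and matched the shape of the template.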