Your extractor will consist of several files. Those used by the Simple Extractor Wrapper are listed below, followed by instructions for creating them:

    • my_python_program.py (required): The Python file that contains your main function. For simplicity, this guide calls the file my_python_program.py, the main function my_main_function, and your extractor my_extractor.
    • extractor_info.json (required): Contains metadata about the extractor.
    • Dockerfile (required): Contains instructions to create a Docker image of your extractor.
    • requirements.txt (optional): Contains names of Python packages that will be installed using the pip command.
    • packages.apt (optional): Contains names of Linux packages that will be installed using the apt-get command.
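
Before building, it can help to confirm that the source directory actually contains the files listed above. The following is a minimal sketch, not part of the Simple Extractor Wrapper: the function name check_extractor_files is hypothetical, and my_python_program.py is the placeholder file name used in this guide.

```python
from pathlib import Path

# File names come from the list above. my_python_program.py is the
# placeholder name used in this guide; substitute your own file name.
REQUIRED_FILES = ["my_python_program.py", "extractor_info.json", "Dockerfile"]
OPTIONAL_FILES = ["requirements.txt", "packages.apt"]


def check_extractor_files(source_dir):
    """Return (missing_required, missing_optional) file names for source_dir."""
    root = Path(source_dir)
    missing_required = [n for n in REQUIRED_FILES if not (root / n).is_file()]
    missing_optional = [n for n in OPTIONAL_FILES if not (root / n).is_file()]
    return missing_required, missing_optional


if __name__ == "__main__":
    missing_required, missing_optional = check_extractor_files(".")
    for name in missing_required:
        print(f"missing required file: {name}")
    for name in missing_optional:
        print(f"optional file not present: {name}")
```

Run the script from the source directory. A missing optional file is only reported for information, since requirements.txt and packages.apt are needed only when the extractor has extra dependencies.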

  1. Using any text editor, create extractor_info.json and save it in your source directory. This file contains the metadata about the extractor that you are creating, and it follows the JSON-LD standard. A template extractor_info.json is provided below for reference. Fill in the relevant details: name, version, author, contributors, source code repository, Docker image name, the data types that the extractor works on, external services used, dependent libraries, and BibTeX-format citations for the publications that the extractor refers to:
{
   "@context": "<context root URL>",
   "name": "<extractor name>",
   "version": "<version number>",
   "description": "<extractor description>",
   "author": "<first name> <last name> <<email address>>",
   "contributors": [
      "<first name> <last name> <<email address>>",
      "<first name> <last name> <<email address>>"
   ],
   "contexts": [
      {
         "<metadata term 1>": "<URL definition of metadata term 1>",
         "<metadata term 2>": "<URL definition of metadata term 2>"
      }
   ],
   "repository": [
      {
         "repType": "git",
         "repUrl": "<source code URL>"
      }
   ],
   "process": {
      "file": [
         "<MIME type/subtype>",
         "<MIME type/subtype>"
      ]
   },
   "external_services": [],
   "dependencies": [],
   "bibtex": []
}
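
JSON does not allow curly quotes or trailing commas, both of which are easy to introduce when copying a template from a web page. The following is a minimal sketch of a check that can be run on the finished file. The function name check_extractor_info is hypothetical, the key list is copied from the template above, and treating every template key as expected is an assumption of this sketch rather than a documented requirement.

```python
import json

# Top-level keys copied from the template above. Expecting all of them
# is an assumption of this sketch, not a documented requirement.
EXPECTED_KEYS = [
    "@context", "name", "version", "description", "author",
    "contributors", "contexts", "repository", "process",
    "external_services", "dependencies", "bibtex",
]


def check_extractor_info(path="extractor_info.json"):
    """Parse the file and return a list of problems (empty if none found)."""
    try:
        with open(path, encoding="utf-8") as f:
            info = json.load(f)
    except json.JSONDecodeError as err:
        # Raised for curly quotes, trailing commas, and other syntax errors.
        return [f"not valid JSON: {err}"]

    if not isinstance(info, dict):
        return ["top-level value is not an object"]

    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in info]
    # Unfilled template placeholders look like "<extractor name>".
    for key in ("name", "version", "description"):
        value = info.get(key)
        if isinstance(value, str) and value.startswith("<") and value.endswith(">"):
            problems.append(f"placeholder not replaced: {key}")
    return problems


if __name__ == "__main__":
    for problem in check_extractor_info():
        print(problem)
```

The check only confirms that the file parses and that the template fields are present and filled in. It does not validate the JSON-LD context itself.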