# Wikipedia Infobox Analyzer

On Wikipedia there are different kinds of infoboxes.
Modern infoboxes retrieve their data from Wikidata.
For legacy reasons, however, many infoboxes still use manually entered values.

This analysis tool lets you view the Wikidata behind an article through the lens of its infobox.
It detects fields that the infobox expects but that are missing in Wikidata, in which case the values are often set manually inside the article.

As articles in other languages are created, someone might also extend the Wikidata entry.
That means that fields that were previously set manually could now be updated to use Wikidata.
This tool can therefore be used to analyze a given infobox template and see which values are now present in Wikidata and which are still missing.
It offers an easier side-by-side comparison than going through all Wikidata properties manually, as it looks only at the properties used by the infobox.
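
At its core, the comparison described above is a check of each property the infobox uses against the statements on the Wikidata item. The following is a minimal sketch in Python, not the tool's actual implementation; the function name `split_by_presence` and the sample data are hypothetical and only illustrate the idea.

```python
def split_by_presence(expected, claims):
    """Split the expected property IDs into those present and those missing.

    `expected` is an iterable of property IDs used by the infobox (e.g. "P18").
    `claims` maps property IDs to lists of statements, similar in shape to
    the "claims" object of a Wikidata item.
    """
    expected = list(expected)
    # A property counts as present only if it has at least one statement.
    present = [pid for pid in expected if claims.get(pid)]
    missing = [pid for pid in expected if not claims.get(pid)]
    return present, missing


# Hypothetical sample data: the item has statements for P18 and P31,
# plus one property (P2046) that the infobox does not use.
expected = ["P18", "P31", "P361", "P571"]
claims = {
    "P18": ["<image statement>"],
    "P31": ["<instance-of statement>"],
    "P2046": ["<area statement>"],
}

present, missing = split_by_presence(expected, claims)
print("present:", present)  # present: ['P18', 'P31']
print("missing:", missing)  # missing: ['P361', 'P571']
```

Properties on the item that the infobox does not use (here `P2046`) are ignored, which is what keeps the comparison short.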

Infoboxes are interesting, and there could be plenty more to check.
But the aim of this tool is to be simple, so that it can be used alongside the Wikipedia editor.
Analysis of an entire Wikidata item is out of scope for this tool.
Read the warnings on Wikipedia and Wikidata for that kind of analysis.

## Usage

Here is a simple example of how this analyzer can be used for the Wikipedia article about "Earth":
``` sh
wikipedia-infobox-analyzer \
    --title Earth \
    --lang en \
    --template <infobox_template_file>
```

By default, the tool assumes that you are looking for articles on the English Wikipedia, but you can provide the language code of another Wikipedia, such as `fr`, `de`, `es` or `eo`.
Make sure the passed title matches the article title, so that the tool can find the Wikidata entry.
The next section goes over what these infobox template files are, where you can find them on Wikipedia, and how you can customize them locally for your Wikidata analysis.
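
To illustrate why the title and language matter, here is a rough Python sketch of how an article title can be resolved to a Wikidata item through the public MediaWiki Action API (`action=query` with `prop=pageprops`). Whether this tool performs the lookup this way is an assumption; the function names `wikibase_item_url` and `extract_item_id` are hypothetical, and the sample response is hand-written in the shape the API returns.

```python
from urllib.parse import urlencode


def wikibase_item_url(title, lang="en"):
    """Build the API URL that asks a Wikipedia for a page's Wikidata item."""
    query = urlencode({
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "titles": title,
        "format": "json",
    })
    return f"https://{lang}.wikipedia.org/w/api.php?{query}"


def extract_item_id(response):
    """Return the Wikidata item ID from a decoded JSON response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        if item:
            return item
    return None


url = wikibase_item_url("Terre", lang="fr")
print(url)

# Hand-written sample response; the page ID key is illustrative only.
sample = {"query": {"pages": {"9228": {"title": "Earth",
                                       "pageprops": {"wikibase_item": "Q2"}}}}}
print(extract_item_id(sample))  # Q2
```

If the title does not match an existing article, the response carries no `wikibase_item`, and a lookup like this one returns `None`.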

## Templates
The Wikipedia sites vary a lot across languages when it comes to templates.
The goal of this tool is to be universal, but as far as I am aware these templates have not been standardized.

To mitigate this, templates can be customized, and you are expected to download them from the Wikipedia for your language.
For instance, go to the infobox template for planets on the English Wikipedia (https://en.wikipedia.org/wiki/Template:Infobox_planet) and download the source to a file.
Then you can add the following line to that file locally:

``` text
{{... Wikidata|P18|P31|P361|P571}}
```

The program ignores whatever you put in place of the `...`.
This lets it accept templates that already include a listing of the Wikidata properties they use.
As an example, this is the case for the following template on the Dutch Wikipedia (the first word means "uses"):

``` text
{{Gebruikt Wikidata|P18|P154|P170|P178|P275|P277|P306|P348|P400|P548|P571|P577|P856|P1324|P2096}}
```
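
Both listings above follow the same pattern: some text, the word `Wikidata`, and then pipe-separated property IDs. The following Python sketch shows one way such a line could be read from a template file. It is not taken from the tool's source; the pattern `LISTING` and the function `listed_properties` are hypothetical names, and the exact matching rules are an assumption.

```python
import re

# Matches a template call whose name ends in "Wikidata", followed by a
# pipe-separated list of property IDs, e.g. {{Gebruikt Wikidata|P18|P154}}.
# Whatever precedes "Wikidata" inside the braces is ignored.
LISTING = re.compile(r"\{\{[^{}|]*Wikidata((?:\|\s*P\d+\s*)+)\}\}")


def listed_properties(template_source):
    """Return the property IDs from the first Wikidata listing, in order."""
    match = LISTING.search(template_source)
    if match is None:
        return []
    return re.findall(r"P\d+", match.group(1))


print(listed_properties("{{... Wikidata|P18|P31|P361|P571}}"))
# ['P18', 'P31', 'P361', 'P571']
print(listed_properties("{{Gebruikt Wikidata|P18|P154|P170}}"))
# ['P18', 'P154', 'P170']
```

A template file without such a line yields an empty list, which is the case where the line has to be added by hand.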

Ideally, the properties a template uses would be discovered automatically from the template itself, but I have not found a way to do that universally.
Besides, you only have to do this once, and this way you can also customize the properties to look for.