Editing Documentation:How to use the Parser

Before reading this article you need to be familiar with the basic syntax of events in a layer, which is described [[Documentation:The Editor and data format|here]].


You can open the Parser by clicking the ''Parser'' button in the bottom right corner of the Editor.


[[File:ParserButton.png|600px]]
[[File:ParserEmpty.png|800px]]


==The workflow==
You need to put the source text (usually wiki text from a Wikipedia article) in the left field. Then, using the buttons at the bottom, you highlight the needed information in the text: event texts and dates. The techniques for doing that are described in this article. After that you click the ''Move selected'' button and the data is moved into the right field. Usually you need to format the data a bit more using the buttons below the right text field. Then you can copy the resulting text and paste it into the Editor to test the data on the timeline.


==Example==
Let’s use US presidents as an example. Suppose you want to create a layer of US presidents (even though such a layer already exists).


First, you’d go to the corresponding [https://en.wikipedia.org/wiki/List_of_presidents_of_the_United_States Wikipedia page] and open its source wiki text.  


[[File:EditSource.png|300px]]


Then you’d copy the wiki text and paste it into the input field of the Parser.


[[File:PresidentsSourceText.png|500px]]
 
Now you need to mark all the presidents in red and all the dates in orange. In the bottom left corner of the screen there is a button that is already red and says ‘Texts’. That’s exactly what we need: it means we are in text selection mode. To select dates you will need to switch to single date selection mode or double date selection mode. For now, don’t do anything with the red button.
 
If you click the ''Select'' button, by default all the links will be selected:
 
[[File:PresidentsTooMuchLinks.png|500px]]


You can manually unhighlight all the links that you don’t need (by selecting a range of text containing the highlighted parts and pressing the Option/Alt key on the keyboard), but that’s too much work.


Instead, let’s do something else. First, clear the selection by clicking the ''Clear'' button.


Now let’s pay attention to the fact that all the links containing presidents’ names are enclosed in triple apostrophes like this:  <code><nowiki>'''[[George Washington]]'''</nowiki></code>. We can use this fact to our advantage.  


Put the prefix <code><nowiki>'''[[</nowiki></code> and the postfix <code><nowiki>]]'''</nowiki></code> in the two small input fields, and then click the ''Select'' button.


[[File:PrefixPostfixExample.png|300px]]


Now only the presidents are highlighted.


[[File:Presidents_Links.png|500px]]
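
If it helps to see the idea in code, here is a minimal Python sketch of selecting everything between a prefix and a postfix. It is not the Parser's actual implementation: the function name <code>select_between</code> and the sample line are made up for illustration, and the Parser may highlight slightly different boundaries.

```python
import re


def select_between(text: str, prefix: str, postfix: str) -> list[str]:
    """Return every shortest piece of text found between prefix and postfix."""
    # re.escape makes characters like [ and ] match literally.
    pattern = re.escape(prefix) + r"(.*?)" + re.escape(postfix)
    return re.findall(pattern, text)


# Invented sample line: two "special" links and one ordinary link.
sample = "| '''[[George Washington]]''' | [[Virginia]] | '''[[John Adams]]'''"

print(select_between(sample, "'''[[", "]]'''"))
```

The ordinary link is skipped because it is not surrounded by the triple apostrophes, which is exactly the uniqueness the prefix and postfix rely on.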


Click the red button to change it to orange. To find dates, just click the ''Find dates'' button. As you can see, all the dates are highlighted, even the ones we don’t need.


[[File:PresidentsTooMuchDates.png|500px]]
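
How the Parser detects dates is not described here, but for dates written like "March 4, 1845" the idea can be approximated with a regular expression. The following is only a rough Python sketch: <code>find_dates</code> and the sample line are hypothetical, and other date formats are not covered.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Matches dates such as "March 4, 1845": month name, day, comma, 4-digit year.
DATE_RE = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")


def find_dates(text: str) -> list[str]:
    """Return every date in the text, wanted or not."""
    return DATE_RE.findall(text)


# Invented sample line: two term dates plus a date we do not need.
sample = "| March 4, 1845 | March 4, 1849 | born November 2, 1795"

print(find_dates(sample))
```

A pattern like this cannot tell which dates matter for the layer, which is why the extra ones still have to be unhighlighted by hand.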


Now you have to manually unhighlight the ones that are not needed by selecting parts of the text with the mouse and pressing the ''Option/Alt'' key.

Once you are done, click the ''Move selected'' button and the data will be moved into the right field with almost correct syntax.

[[File:Presidents_Almost_Done.png|800px]]

Now you can click the ''Format dates'' button to change the date format, so that a line such as <code><nowiki>James K. Polk;;;;March 4, 1845;March 4, 1849;</nowiki></code> becomes <code><nowiki>James K. Polk;;;;03.04.1845;03.04.1849;</nowiki></code>:
 
[[File:PresidentsResult.png|400px]]
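
The date formatting step can be sketched in Python as follows. This is an illustration under assumptions, not the Parser's code: the function name <code>format_dates</code> is made up, and the target format (month, day, year separated by dots) is inferred from the James K. Polk example in this article.

```python
import re

MONTH_NUMBERS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Capitalised word, day, comma, 4-digit year, e.g. "March 4, 1845".
DATE_RE = re.compile(r"\b([A-Z][a-z]+) (\d{1,2}), (\d{4})\b")


def format_dates(line: str) -> str:
    """Rewrite 'March 4, 1845' style dates as '03.04.1845'."""
    def replace(match: re.Match) -> str:
        month = MONTH_NUMBERS.get(match.group(1))
        if month is None:
            return match.group(0)  # not a month name, leave untouched
        return f"{month:02d}.{int(match.group(2)):02d}.{match.group(3)}"

    return DATE_RE.sub(replace, line)


print(format_dates("James K. Polk;;;;March 4, 1845;March 4, 1849;"))
```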
 
 
 
Now everything looks right. Copy the result and paste it into the Editor.
 
==A harder task==
Now that was easy. The reason it was easy is that we got lucky: all US presidents were enclosed in triple apostrophes, which made them easily distinguishable from other links. This is not usually the case, however.


I didn’t want to look for another example, so I just used the same source text but removed the apostrophes from it.
 
[[File:PresidentsWithoutApostraphes.png|600px]]
 
Let’s pretend that it was like that to begin with. Now there is nothing unique about the links that contain the names of US presidents. What to do in this case? I would start by finding the dates first.
 
[[File:PresidentsDatesFirst.png|600px]]
 
Once dates are highlighted it is easier to visually find presidents in the text. At this point you can manually select presidents' names with a mouse and highlight them using the Control key on the keyboard. But there is an easier way. You can make the syntax around the presidents' names a bit more special than that of other links. For example, you can add an asterisk in front of the opening brackets: <code><nowiki>*[[George Washington]]</nowiki></code>.
 
Then you can add <code><nowiki>*[[</nowiki></code> and <code><nowiki>]]</nowiki></code> into the small input fields and click ''Select''. Now all the presidents will be selected.
 
[[File:PresidentsAsterisks.png|600px]]
 
==The technique==
As you can see the main technique for selecting texts is to find something unique about the surroundings of needed texts. If there is nothing unique, create such uniqueness manually. For large layers it can take some time, but it's still faster than copying and pasting all the texts manually. Here is another example:
 
[[File:LinkolnExample.png|600px]]
 
I placed an asterisk <code>*</code> between dates and texts. Then I specified asterisk as a prefix:
 
[[File:AsteriskAsPrefix.png|300px]]
 
Then I clicked the ''From prefix till line end'' button to select all the texts.
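
Selecting from a prefix to the end of the line can be sketched in Python like this. It is only an approximation of what the button does: the function name and the sample lines are invented for illustration.

```python
def select_from_prefix_to_line_end(text: str, prefix: str) -> list[str]:
    """For each line containing the prefix, return what follows it."""
    selected = []
    for line in text.splitlines():
        _, found, rest = line.partition(prefix)
        if found:  # the prefix occurs in this line
            selected.append(rest.strip())
    return selected


# Invented sample: an asterisk placed by hand between the date and the text.
sample = (
    "March 4, 1861 * Lincoln is inaugurated\n"
    "a line without the marker\n"
    "April 15, 1865 * Lincoln dies"
)

print(select_from_prefix_to_line_end(sample, "*"))
```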
 
==The advanced technique==
Sometimes the way the source text is organised is so unfortunate that it would take a crazy amount of time to prepare it for the Parser manually. In such cases you have to use a thing called regular expressions. You know how in many text editors you can search and replace pieces of text? Regular expressions are just a more advanced way of doing search and replace.
 
Let me give you an example of the situation where regular expressions have helped me. Open the [https://en.wikipedia.org/wiki/List_of_Roman_consuls list of Roman consuls] on Wikipedia and examine that list. You see how most of the time there were 2 or more consuls in each year. In the wiki text it looks like this:
 
<blockquote><poem><nowiki>
| align=center | 503
| [[Agrippa Menenius Lanatus (consul 503 BC)|Agrippa Menenius Lanatus]]
| [[Publius Postumius Tubertus|P. Postumius Tubertus]] II
|-
| align=center | 502
| [[Opiter Verginius Tricostus (consul 502 BC)|Opet. Verginius Tricostus]]
| [[Spurius Cassius Viscellinus|Sp. Cassius Viscellinus]]
|-
| align=center | 501
| [[Postumus Cominius Auruncus|Post. Cominius Auruncus]]
| [[Titus Lartius|T. Lartius]] (Flavus ''or'' Rufus)
|}
</nowiki></poem></blockquote>
 
What I needed to do was to turn a block of text that looks like this:<blockquote><poem><nowiki>
| align=center | 502
| [[Opiter Verginius Tricostus (consul 502 BC)|Opet. Verginius Tricostus]]
| [[Spurius Cassius Viscellinus|Sp. Cassius Viscellinus]]
</nowiki></poem></blockquote>
 
into a block of text that looks like this:<blockquote><poem><nowiki>
[[Opiter Verginius Tricostus (consul 502 BC)|Opet. Verginius Tricostus]];;;;09.01.-502;;
[[Spurius Cassius Viscellinus|Sp. Cassius Viscellinus]] (post.);;;;09.01.-502;;
</nowiki></poem></blockquote>
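
To give a feeling for what such a replacement looks like, here is a minimal sketch using Python's <code>re</code> module instead of a text editor. It is not the exact expression I used. It assumes every block has exactly one year line followed by exactly two consul lines, and the <code>09.01.</code> date prefix and the <code>(post.)</code> suffix are simply copied from the target output above.

```python
import re

# One year line followed by two consul lines.
BLOCK_RE = re.compile(
    r"^\| align=center \| (\d+)\n"  # group 1: the year BC
    r"\| (.+)\n"                    # group 2: first consul
    r"\| (.+)$",                    # group 3: second consul
    re.MULTILINE,
)

# \1, \2 and \3 refer back to the captured groups.
REPLACEMENT = r"\2;;;;09.01.-\1;;" "\n" r"\3 (post.);;;;09.01.-\1;;"


def convert_consuls(wikitext: str) -> str:
    """Turn each year/consul/consul block into two event lines."""
    return BLOCK_RE.sub(REPLACEMENT, wikitext)


source = (
    "| align=center | 502\n"
    "| [[Opiter Verginius Tricostus (consul 502 BC)|Opet. Verginius Tricostus]]\n"
    "| [[Spurius Cassius Viscellinus|Sp. Cassius Viscellinus]]"
)

print(convert_consuls(source))
```

Years with more than two consuls, and the table separator lines, would need additional replacements.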
 
I could have done it manually, but the problem is that there are almost 500 years’ worth of Roman consuls (I only did the consuls of the Roman Republic).
 
Regular expressions allowed me to find every such block of text and do the needed replacements automatically. In this particular case I created the final syntax right in the text editor without using the Parser at all. But a lot of the time I use the text editor to prepare the text for the Parser. It all depends on the situation.
 
I’m not going to give you a lesson on how to use regular expressions in this article. Maybe someday I will make a video on this topic. Here I’ll just tell you what you need to get to start using regular expressions and where you may find useful information about them.
 
To use regular expressions you need a special text editor that programmers use. The one I’m using is VSCode. You can download it [https://code.visualstudio.com/ here]. Don’t worry, you don’t need to be a programmer to use it. Treat it as just a text editor. Your workflow will be: create a new file, paste the text into it, and do the replacements. Before searching and replacing, don’t forget to press the button with a dot and an asterisk to activate regular expressions:
 
[[File:SearchWithRegExp.png]]
 
To learn about regular expressions read this [https://docs.microsoft.com/en-us/visualstudio/ide/using-regular-expressions-in-visual-studio article].
 
The article describes a different text editor, which you can probably use as well (I have never used it, so I recommend the one I’m familiar with). The basic syntax of regular expressions is largely the same no matter which text editor you use, although some details differ between tools.
 
==Converting Julian dates to Gregorian dates==
Sometimes (very rarely though) the sources that you use may provide only old style dates. In this case you work with the data the same way, but in the end, after the dates are formatted, you just click the ''Julian to Gregorian'' button. All the dates after the calendar reform of 1582 will be converted. The dates before the reform will stay the same. Just make sure you don’t accidentally convert a date that was Gregorian to begin with.
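
For the curious, the conversion itself can be sketched in Python with the standard Julian Day Number arithmetic. This is not the Parser's code. The function name is made up, and the cutoff (Julian dates before October 5, 1582 are returned unchanged) is an assumption based on the description above.

```python
def julian_to_gregorian(year: int, month: int, day: int) -> tuple[int, int, int]:
    """Convert a Julian calendar date to the Gregorian calendar."""
    # Assumed cutoff: dates before the 1582 reform are left as they are.
    if (year, month, day) < (1582, 10, 5):
        return (year, month, day)

    # Step 1: Julian calendar date -> Julian Day Number.
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083

    # Step 2: Julian Day Number -> Gregorian calendar date.
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    g_day = e - (153 * m + 2) // 5 + 1
    g_month = m + 3 - 12 * (m // 10)
    g_year = 100 * b + d - 4800 + m // 10
    return (g_year, g_month, g_day)


# February 11, 1732 (Old Style) is February 22, 1732 (New Style).
print(julian_to_gregorian(1732, 2, 11))
```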
 
When testing such a layer on the Timeline, turn on Julian dates in the Menu. This way you'll be able to compare the dates on the Timeline with the dates in the sources that you used, as they will all be Julian dates.
