# Open ARMLET

Open ARMLET (Automated Retrieval of Metadata in Legal Texts) is the open-source version of ARMLET, a framework that, in this version, retrieves structural metadata (e.g., titles, chapters, articles, paragraphs, and alineas) as well as cross-references from legal texts, more specifically legislative acts.

The goal of the ARMLET project is to provide a configurable framework for the automated extraction of metadata from legal texts, in support of activities such as open data, legal search, legal compliance checking, and legal data management. It targets in particular the conversion of legislative texts from doc(x) or PDF to XML for open and transparent legislative data.

This project has been developed over the years at the Interdisciplinary Centre for Security, Reliability and Trust of the University of Luxembourg, in collaboration with the Central Legislative Service (Service central de législation) of the Ministry of State of Luxembourg.
In particular, ARMLET has been developed, tested, and improved on several legislative acts, including codes, laws, and regulations.
Some of the configuration examples provided in the project are actual legislative acts published on the Legilux portal (http://legilux.public.lu), the online official gazette of the Grand Duchy of Luxembourg.

# License and Copyright
ARMLET is Copyright © University of Luxembourg / Interdisciplinary Centre for Security, Reliability and Trust - 2014-2019

Acknowledged co-inventors (in alphabetical order):
- Morayo Adedjouma (morayoade@gmail.com)
- Lionel Briand (briand@svv.lu)
- Wei Dou (dou@svv.lu) 
- Mehrdad Sabetzadeh (mike@svv.lu)
- Nicolas Sannier (sanniver@svv.lu)
- Virgil Tassan (virgil.tassan@gmail.com)

ARMLET is released under the GNU Lesser General Public License, version 3 of June 2007 (referred to as "The GNU License" below).
We make use of several third-party tools, which are licensed separately under licenses compatible with the GNU License.
In particular, we use the NLP framework GATE (http://gate.ac.uk), which is also available under the LGPL 3 license.

# Folder Organization

- ARMLET is built upon the NLP framework GATE (https://gate.ac.uk), an open-source framework developed and maintained by the University of Sheffield (Gate 8.1 and Libs folders).
- GATE runs NLP scripts written in the JAPE language, which can be bundled into applications and executed over the texts. In ARMLET, many of the scripts are meant to be executed as-is over the text (Jape folder), whereas others are automatically generated or reused (CustomJape folder), depending on the configuration you provide.
- Some configuration files can be found in the Test folder.
- A structure configuration is defined by two files, `<name>_structure.json` and `<name>_rules.json`. The first describes the structural hierarchy of the document; the second describes the detection rules for these structures. More details are provided in the documentation. These files can be found in the Structure and CustomStructure folders. Apologies for the current mess in these folders. Note that the location of the files does not matter as long as both are correctly referenced in the configuration file and the naming convention is respected. From these two files, ARMLET automatically generates JAPE scripts and GATE applications in the CustomJape and Apps folders for the detection of structural metadata.
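
To illustrate the naming convention described above, here is a minimal Python sketch that pairs `<name>_structure.json` and `<name>_rules.json` files and reports names for which one of the two is missing. It is not part of ARMLET: the function name `pair_configurations` is hypothetical, and the sketch assumes, for simplicity, that both files of a pair sit in the same folder (ARMLET itself only requires that both are referenced in the configuration file). It does not read the JSON content, whose format is described in the documentation.

```python
from pathlib import Path

STRUCTURE_SUFFIX = "_structure.json"
RULES_SUFFIX = "_rules.json"


def pair_configurations(folder):
    """Group <name>_structure.json and <name>_rules.json files by <name>.

    Hypothetical helper, not part of ARMLET. Returns (pairs, incomplete):
    `pairs` maps each name that has both files to a
    (structure_path, rules_path) tuple; `incomplete` is the sorted list
    of names for which only one of the two files was found.
    """
    structures = {}
    rules = {}
    for path in Path(folder).iterdir():
        if not path.is_file():
            continue
        if path.name.endswith(STRUCTURE_SUFFIX):
            structures[path.name[: -len(STRUCTURE_SUFFIX)]] = path
        elif path.name.endswith(RULES_SUFFIX):
            rules[path.name[: -len(RULES_SUFFIX)]] = path

    # Names present in both dictionaries form complete pairs.
    complete = structures.keys() & rules.keys()
    pairs = {name: (structures[name], rules[name]) for name in complete}
    # Names present in exactly one dictionary are missing their counterpart.
    incomplete = sorted(structures.keys() ^ rules.keys())
    return pairs, incomplete
```

Running such a check on the Structure or CustomStructure folder before launching ARMLET could help catch a misnamed file early.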


# Installation and setup

ARMLET comes as a whole and is executed by calling ArmletMain.jar with one argument: the path to a configuration file containing the information needed to convert a document into XML.
Therefore, the location of its elements should not be changed, since ArmletMain.jar dynamically calls GATE and its plug-ins at precise locations.
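
As an illustration, the call described above could be scripted as in the following Python sketch. Several points are assumptions rather than documented behavior: that the jar is launched with `java -jar`, that running it from the ARMLET folder is enough for it to locate GATE and its plug-ins, and the helper names `build_armlet_command` and `run_armlet`, which are hypothetical.

```python
import subprocess
from pathlib import Path


def build_armlet_command(config_path, jar_path="ArmletMain.jar"):
    """Return the argument list for `java -jar <jar> <config>`.

    Assumption: ArmletMain.jar is a runnable jar taking the configuration
    file path as its single argument.
    """
    return ["java", "-jar", str(jar_path), str(config_path)]


def run_armlet(config_path, install_dir="."):
    """Run ARMLET from its own folder on the given configuration file.

    The working directory is set to the ARMLET folder (an assumption) so
    that the relative locations of GATE and its plug-ins stay unchanged.
    """
    config_path = Path(config_path).resolve()
    if not config_path.is_file():
        raise FileNotFoundError(f"configuration file not found: {config_path}")
    command = build_armlet_command(config_path)
    # check=True raises CalledProcessError if the jar exits with an error.
    return subprocess.run(command, cwd=Path(install_dir), check=True)
```

For example, `run_armlet("path/to/configuration-file", install_dir="path/to/armlet")` would launch one conversion, where both paths are placeholders to be replaced with your own.
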
More information can be found in the documentation folder.