# Open ARMLET

Open ARMLET (Automated Retrieval of Metadata in Legal Texts) is the open source version of ARMLET, a framework that, in this version, retrieves structural metadata (e.g., titles, chapters, articles, paragraphs, and alineas) as well as cross-references from legal texts, more specifically legislative acts.  
The goal of the ARMLET project is to provide a configurable framework for the automated extraction of metadata from legal texts, in support of activities such as open data, legal search, legal compliance checking, and legal data management. It targets in particular the conversion of legislative texts from doc(x) or PDF to XML for open and transparent legislative data.

This project has been developed over the years at the Interdisciplinary Centre for Security, Reliability and Trust (SnT) of the University of Luxembourg, in collaboration with the Central Legislative Service (Service central de législation) of the Ministry of State of Luxembourg.  
In particular, ARMLET has been developed, tested and improved over several legislative acts, including codes, laws and regulations. 

Some of the configuration examples provided in the project are actual legislative acts published on the Legilux portal (http://legilux.public.lu), the online Official Gazette of the Grand Duchy of Luxembourg.

# License and Copyright

ARMLET is copyrighted © by University of Luxembourg / Interdisciplinary Centre for Security, Reliability and Trust - 2014-2019.

Acknowledged co-authors (in alphabetical order):

- Adedjouma Morayo (morayoade@gmail.com)
- Briand Lionel (lionel.briand@uni.lu)
- Dou Wei
- Sabetzadeh Mehrdad (mehrdad.sabetzadeh@uni.lu)
- Sannier Nicolas (nicolas.sannier@wanadoo.fr)
- Tassan-Zanin-Caser Virgil (virgil.tassan@gmail.com)


ARMLET is released under the GNU LESSER GENERAL PUBLIC LICENSE, version 3 of June 2007 (referred to as "the License" below). Please read the license file (licence.html) in the project folder carefully.  
We make use of several third-party tools that are licensed separately; their licenses are compatible with the License.  
In particular, we use the NLP framework Gate (http://gate.ac.uk), which is also available under the LGPLv3 license.

This software is delivered "as is" by the University of Luxembourg and the co-authors with no warranties whatsoever, including but not limited to any warranty of use, implementation, integration, merchantability, non-infringement, fitness for any particular purpose or other warranty otherwise arising out of any proposal, specification or sample.


# Folder Organization

- ARMLET is built upon the NLP framework Gate (https://gate.ac.uk), an open source framework developed and maintained by the University of Sheffield (Gate 8.1 and Libs folders).
- The NLP framework relies on scripts written in the JAPE language, which can be bundled into applications and executed over the texts. In ARMLET, many of the scripts are meant to be executed as-is over the text (Jape folder), whereas others are automatically generated or reused (CustomJape folder), depending on the configuration you provide.
- Some configuration files can be found in the Test folder.
- The structure configuration is defined by two files: a `<name>_structure.json` file and a `<name>_rules.json` file. The first describes the structural hierarchy of the document; the second describes the detection rules for these structures. More details are provided in the documentation. Example configurations can be found in the Structure and CustomStructure folders. The location of the files does not matter as long as both files are kept together, are correctly referenced in the configuration file, and follow the naming convention. From these two files, ARMLET automatically generates JAPE scripts and Gate applications in the CustomJape and Apps folders for the detection of structural metadata.
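
Since the structure configuration only works when both files are present and correctly named, it can help to check a folder before launching ARMLET. The following Python sketch is not part of ARMLET: the function names are hypothetical, and it assumes that "kept together" means "in the same folder". It only looks at file names and does not validate the JSON content.

```python
from pathlib import Path

# Suffixes taken from the naming convention described above.
STRUCTURE_SUFFIX = "_structure.json"
RULES_SUFFIX = "_rules.json"


def _names(folder, suffix):
    """Return the <name> part of every file in `folder` ending with `suffix`."""
    return {p.name[: -len(suffix)] for p in Path(folder).glob("*" + suffix)}


def find_config_pairs(folder):
    """Map each <name> that has both files to its (structure, rules) paths."""
    folder = Path(folder)
    complete = _names(folder, STRUCTURE_SUFFIX) & _names(folder, RULES_SUFFIX)
    return {
        name: (folder / (name + STRUCTURE_SUFFIX), folder / (name + RULES_SUFFIX))
        for name in sorted(complete)
    }


def find_orphans(folder):
    """Return every <name> for which only one of the two files exists."""
    return sorted(_names(folder, STRUCTURE_SUFFIX) ^ _names(folder, RULES_SUFFIX))
```

For example, `find_orphans("CustomStructure")` would list the names for which either the structure file or the rules file is missing.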


# Installation and Setup

ARMLET comes as a whole and is executed by calling ArmletMain.jar with one argument: the path to a configuration file containing the information needed to convert a document into XML.
The location of the bundled elements should therefore not be changed, since ArmletMain.jar dynamically calls Gate and its plug-ins at precise locations.
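
For scripted or batch conversions, the call described above can be wrapped in a few lines of Python. This is a minimal sketch under several assumptions that the README does not confirm: a `java` executable is on the PATH, ArmletMain.jar sits at the root of the installation folder, and running from that folder is what allows the jar to find Gate and its plug-ins. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_armlet_command(install_dir, config_path, java="java"):
    """Build the command line: java -jar ArmletMain.jar <configuration file>."""
    # Assumption: the jar is located at the root of the installation folder.
    jar = Path(install_dir) / "ArmletMain.jar"
    # The configuration path is made absolute so it stays valid when the
    # working directory is switched to the installation folder.
    return [java, "-jar", str(jar), str(Path(config_path).resolve())]


def run_armlet(install_dir, config_path):
    """Run ARMLET from its installation folder and fail loudly on errors."""
    command = build_armlet_command(install_dir, config_path)
    # Assumption: the installation folder is the expected working directory.
    return subprocess.run(command, cwd=install_dir, check=True)
```

A call such as `run_armlet("path/to/armlet", "path/to/configuration-file")` would then start one conversion; both paths are placeholders.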

More information can be found in the documentation folder.

# Related Publications


**- Digitizing Luxembourg's Legal Corpora: Experience and Vision**  
*John Dann, Mehrdad Sabetzadeh, Nicolas Sannier, and Lionel Briand (authors listed neither in contribution order nor in alphabetical order), in the 18th Law via the Internet conference (LVI'2018), Florence, Italy, 11-12 October 2018.*

**- Legal Markup Generation in the Large: An Experience Report**  
*Nicolas Sannier, Morayo Adedjouma, Mehrdad Sabetzadeh, Lionel C. Briand, John Dann, Marc Hisette, Pascal Thill, in the proceedings of the 25th IEEE International Requirements Engineering Conference (RE'2017), Lisbon, Portugal, pp. 302-311, 4-8 September 2017. Available at: https://orbilu.uni.lu/handle/10993/31825*

**- From RELAW Research to Practice: Reflections on an Ongoing Technology Transfer Project**  
*Nicolas Sannier, Mehrdad Sabetzadeh and Lionel C. Briand, in the proceedings of the IEEE 25th International Requirements Engineering Conference Workshops (RELAW), Lisbon, Portugal, pp. 204-208, September 5th, 2017. Available at: https://orbilu.uni.lu/handle/10993/33126*

**- An Automated Framework for Detection and Resolution of Cross References in Legal Texts**  
*Nicolas Sannier, Morayo Adedjouma, Mehrdad Sabetzadeh, and Lionel Briand, in the Requirements Engineering Journal 22(2): 215-237 (2017) (1st online Nov. 17, 2015). Available at: https://orbilu.uni.lu/handle/10993/22286*