Active Learning of Automata for JSON-Streaming Validation

Staquet, Gaëtan

Download

Unpublished conference/Abstract (Scientific congresses and symposiums)

Active Learning of Automata for JSON-Streaming Validation

Staquet, Gaëtan

2022 • Highlights 2022 of Logic, Games and Automata

Permalink
https://hdl.handle.net/20.500.12907/43119

Files (1)Send to Details Statistics Bibliography Similar publications

Files

Full Text

highlights2022.pdf

Author postprint (166.66 kB)

Download

All documents in ORBi UMONS are protected by a user license.

Send to

RIS BibTex APA Chicago Permalink X Linkedin

Details

Keywords :

validation; automata learning; streaming; JSON document

Abstract :

[en] In the recent years, JSON documents have become an important way of transferring information between computers. Notably, JSON documents contain two types of sequences that can be arbitrarily nested: an ordered collection of key-value pairs, and an unordered collection of values. To ensure that the communication can be done, i.e., that the receiver understands the message, a document should be structured in a specific way. This structure can be described by a JSON schema, which is itself a JSON document. With a schema, the receiver can verify that the document is valid, i.e., correct with regards to the schema. The state of the art consists in exploring the tree encoded in a JSON schema and verifying that there is a branch such that the JSON document satisfies all the constraints given on the branch. In this joint work with Véronique Bruyère (University of Mons) and Guillermo A. Pérez (University of Antwerp), we propose an alternative approach to JSON document validation based on automata. The advantage of having a finite-state model to validate a document is that validating a JSON document in a streaming context becomes trivially efficient [1]. More precisely, we consider three types of automata: realtime one-counter automata (ROCAs), visibly one-counter automata (VCAs), and visibly pushdown automata (VPAs). Thanks to active learning algorithms [2, 3, 4], we can construct an automaton that behaves as a (simplified) validator. Our experiments show that VPAs are the most promising family of automata as they can be learned in a reasonable amountof time, and their stack enables them to easily verify whether the document is well-nested. However, to make the learning process more efficient, we assume a fixed order for every sequence that appears in the JSON documents. Therefore, the automaton can only process documents following this order. This is problematic as, in a streaming context, some of the key-value pairs might appear in a different order. An easy fix would be to consider every possible permutation of key-value pairs but it would lead to an exponential blowup in the automaton’s size. We thus want to explore alternative solutions to allow permutation of the pairs without this exponential blowup. [1] Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley series in computer science / World student series edition. Addison-Wesley, 1986. [2] Véronique Bruyère, Guillermo A. Pérez, Gaëtan Staquet. Learning Realtime One-Counter Automata. TACAS, 2022. [3] Daniel Neider, Christof Löding. Learning Visibly One-Counter Automata in Polynomial Time. Technical Report AIB-2010-02, RWTH Aachen (January 2010), 2010. [4] Malte Isberner. Foundations of Active Automata Learning: an Algorithmic Perspective. Technical University Dortmund, Germany, 2015.

Research center :

CREMMI - Modélisation mathématique et informatique

Disciplines :

Computer science

Author, co-author :

Staquet, Gaëtan ; Université de Mons - UMONS > Faculté des Sciences > Service d'Informatique théorique ; UA - Université d'Anvers [BE] > Computer Science

Language :

English

Title :

Active Learning of Automata for JSON-Streaming Validation

Publication date :

29 June 2022

Event name :

Highlights 2022 of Logic, Games and Automata

Event organizer :

Université Paris Cité

Event date :

From 28 June to 01 July 2022

Audience :

International

Research unit :

S829 - Informatique théorique

Research institute :

R150 - Institut de Recherche sur les Systèmes Complexes
R300 - Institut de Recherche en Technologies de l'Information et Sciences de l'Informatique

Name of the research project :

5104 - ASP-Staquet - Artwork - Fédération Wallonie Bruxelles

Funders :

F.R.S.-FNRS - Fonds de la Recherche Scientifique

Available on ORBi UMONS :

since 02 July 2022

Statistics

Number of views

41 (3 by UMONS)

Number of downloads

17 (0 by UMONS)

More statistics