Model Wizard #4

richard-churchman · 2022-12-29T06:56:48Z

Creating a machine learning model in Jube can be a convoluted process involving creating a model, specifying fields to be extracted, specifying tags and then loading data via HTTP endpoint, before being available for training in the embedded Exhaustive machine learning algorithm. The requirements contrast to products which can achieve the same through the application of a CSV file. It follows that despite having more advanced capabilities the adoption may be reduced to other products. While Jube was not designed as an automated machine learning Wizard, there appears increasing overlap

It is proposed that a Model Wizard be created to take a CSV file and parse the metadata and data itself, automatically creating all configuration elements that are otherwise created manually. The file will be parsed for its data to identify the universe of categorical variables, with these being created as Boolean XPath expressions (a process which currently is done typically outside of Jube).

Task: Ensure JSON Path Expression returns a Boolean value

As categorical data pivoting will be done in Jube, JSON Path must be available in the Request XPath Model Configuration to return based on Expression, for example, $.[?(@.=='Politician')].

Task: Create a new page to parse the CSV file

The new page called Model Wizard, existing under the Models menu item, will accept a CSV file as an upload and proceed to parse the headers. For each header the data will be inspected:

Is all numeric, in which case will be treated as Float for the purpose of model configuration.
Has the presence of string data, in which case will be treated as String for the purpose of model configuration.

In keeping with the stateless nature of the design, the parsing will be stored in tables in the database for recall by the user interface. At this stage, the model will not be created.

Task: Allocate Dependent Variable

With the metadata having been established, the page must accept further configuration parameters, specifically including the dependent variable, which will go on to be a tag value, corresponding Exhaustive Model and Activation Rule.

Task: Create Model

Based on metadata and configuration create the model in Jube comprising:

Headers will be transposed to Request XPath configuration elements.
For each String in Categorical variables the header will be transposed as an expression (i.e. Categorical Data Pivoting).
For each String in the Categorical variable specified as Dependent Variable a Tag element will be created and;
An Exhaustive configuration element will be created to target the Tag disposition for machine learning and;
For good measure, an Activation Rule element will be created targeting the return value from Exhaustive models, where > 0.5 will drive activation. The Activation Rule is not strictly necessary as the Exhaustive recall values are available in their raw form on recall.

Task: Load Data from CSV into JSON for storage in the Archive

Transpose the CSV file to a JSON representation and store it in the Archive table which will make the data available for Exhaustive training.

Task: Synchronise Model

Insert data to cause the model to synchronise and thus start Exhaustive training.

richard-churchman · 2023-01-02T06:39:19Z

Created branch.

…when used in conjunction with Boolean data types in the Request XPath definitions. See #4 for specification.

richard-churchman · 2023-01-03T11:09:28Z

Completed task to refactor JSON parsing to support JSON Path expressions. Committed to branch.

A working example of a JSON Path expression is:

$..[?(@.Brand == 'ZTE')]

Note the double stop \ point to signify that the JSON Path is to be tested against a singular object, and not an array object. This JSON Path does not work on all parsers but seems fine in JSON.net.

…gration which inserts a new Permission Specification and allocates to Administrator in Role Registry Permission. Created a page called EntityAnalysisModelWizard and included the validation of the new Permission Specification in code behind. Included the page location in the shared layout menu also on validation of the new Permission Specification. See #4 for specification.

…when used in conjunction with Boolean data types in the Request XPath definitions. See #4 for specification.

…gration which inserts a new Permission Specification and allocates to Administrator in Role Registry Permission. Created a page called EntityAnalysisModelWizard and included the validation of the new Permission Specification in code behind. Included the page location in the shared layout menu also on validation of the new Permission Specification. See #4 for specification.

richard-churchman · 2023-12-26T08:01:12Z

Picking this issue back up after a period of inactivity. Some changes to the design will include categorical variables being laid out in the Inline Function rather than using Json XPath processing. This means that the original file format is fully respected as Json and the user does not need to worry about breaking out categorical variables on recall. It was too convoluted to achieve the categorical variable pivoting in the Request XPath, which is what the Inline Function is for anyway.

richard-churchman added a commit that referenced this issue Jan 3, 2023

Refactoring of JSON payload parsing to support JSON Path expressions …

82868d2

…when used in conjunction with Boolean data types in the Request XPath definitions. See #4 for specification.

richard-churchman self-assigned this Dec 26, 2023

richard-churchman added the enhancement New feature or request label Dec 26, 2023

richard-churchman added a commit that referenced this issue Dec 26, 2023

Refactoring of JSON payload parsing to support JSON Path expressions …

bfc3b48

…when used in conjunction with Boolean data types in the Request XPath definitions. See #4 for specification.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Model Wizard #4

Model Wizard #4

richard-churchman commented Dec 29, 2022 •

edited

richard-churchman commented Jan 2, 2023

richard-churchman commented Jan 3, 2023

richard-churchman commented Dec 26, 2023 •

edited

Model Wizard #4

Model Wizard #4

Comments

richard-churchman commented Dec 29, 2022 • edited

Task: Ensure JSON Path Expression returns a Boolean value

Task: Create a new page to parse the CSV file

Task: Allocate Dependent Variable

Task: Create Model

Task: Load Data from CSV into JSON for storage in the Archive

Task: Synchronise Model

richard-churchman commented Jan 2, 2023

richard-churchman commented Jan 3, 2023

richard-churchman commented Dec 26, 2023 • edited

richard-churchman commented Dec 29, 2022 •

edited

richard-churchman commented Dec 26, 2023 •

edited