Sometimes data must be cleansed and standardized to get more accurate profiling results. In that case, you need to build a plan.
A plan file defines the logic and rules to be applied to the input data in order to produce the desired output. Plans are created by placing Steps onto a canvas and connecting them together. Steps are data processing algorithms that can be used to read, transform and analyze data, among other actions.
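The idea of steps connected into a plan can be sketched in plain Python. This is only a conceptual analogy, not the plan file format or any DQ Analyzer API: the step functions, the `run_plan` helper, and the sample records are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Step = Callable[[Iterable[Record]], Iterator[Record]]


def read_input(rows: Iterable[Record]) -> Iterator[Record]:
    """Reader step: emits a copy of every input record."""
    for row in rows:
        yield dict(row)


def trim_names(records: Iterable[Record]) -> Iterator[Record]:
    """Transform step: strips surrounding whitespace from the name."""
    for record in records:
        record["name"] = record["name"].strip()
        yield record


def upper_country(records: Iterable[Record]) -> Iterator[Record]:
    """Transform step: standardizes the country code to upper case."""
    for record in records:
        record["country"] = record["country"].strip().upper()
        yield record


def run_plan(source: Iterable[Record], steps: list[Step]) -> list[Record]:
    """Connect the steps in order: the output of one feeds the next."""
    stream: Iterable[Record] = source
    for step in steps:
        stream = step(stream)
    return list(stream)


raw = [
    {"name": "  Alice ", "country": "us"},
    {"name": "Bob", "country": "Ca "},
]
result = run_plan(raw, [read_input, trim_names, upper_country])
```

Each function plays the role of one step on the canvas, and the list passed to `run_plan` plays the role of the connections between them.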
Examples of complex profiling plans are available in the Tutorials project in DQ Analyzer.
The Plan Editor
The image below shows a plan in the Plan Editor, which is launched every time you open or create a plan. The Plan Editor consists of the (1) Canvas, where the plan logic is defined (by connecting Steps together), and the (2) Palette, where the various steps and actions are listed.
Creating a Plan File
To create a new plan file:
- Right-click a project or folder in the explorer panel and select New > Plan. Alternatively, use the toolbar. Both options are shown below.
- Specify the Name of the plan and the place (Container) for storing it.
Adding Steps to the Canvas
To add steps to the canvas, do one of the following:
- Drag needed steps from the Palette to the Canvas.
- Press Ctrl+I or Insert and select the step from a filterable list.
To learn how particular steps work, go through plans in the tutorial project: DQ Projects > Tutorials.
To connect two steps, drag from the out endpoint of one step to the in endpoint of another step.
Editing Step Properties
Most steps require (or benefit from) some configuration to perform their functions, which is done by accessing the step properties.
To edit step properties, double-click the step or right-click it and select Edit Properties:
In the image below, a regular expression is defined in the Regex Matching step:
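As a rough analogy for what a regex-matching step does, the Python sketch below tests a column value against a pattern and copies the captured groups into new columns. The pattern, the `regex_match` helper, and the column names are assumptions made for this example, and the regular expression dialect used by the product may differ from Python's.

```python
import re

# Hypothetical pattern: an ISO-style date with three named capture groups.
DATE_PATTERN = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def regex_match(record: dict, column: str, pattern: re.Pattern) -> dict:
    """Return a copy of the record with a match flag and one column per group."""
    out = dict(record)
    match = pattern.fullmatch(str(record.get(column, "")))
    out["matched"] = match is not None
    for name in pattern.groupindex:
        # Unmatched records get empty (None) values in the captured columns.
        out[name] = match.group(name) if match else None
    return out


good = regex_match({"birth": "1990-07-15"}, "birth", DATE_PATTERN)
bad = regex_match({"birth": "15/07/1990"}, "birth", DATE_PATTERN)
```

Records that do not match keep their original value and are only flagged, so a later step can decide how to handle them.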
In the image below, the Column Assigner step is edited: a column is created and an expression is defined for it:
Press Ctrl+Space to get a list of available functions and input columns. Press Ctrl+Space+Space to get a list of available input columns.
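Creating a column and defining an expression for it can be pictured with the following Python sketch. It is an analogy only: `assign_column`, the sample records, and the lambda are hypothetical, and the product's own expression language is not shown here.

```python
from typing import Callable


def assign_column(
    records: list[dict],
    new_column: str,
    expression: Callable[[dict], object],
) -> list[dict]:
    """Return copies of the records with new_column set to expression(record)."""
    result = []
    for record in records:
        out = dict(record)
        out[new_column] = expression(out)
        result.append(out)
    return result


customers = [
    {"first_name": "Ada", "last_name": "Lovelace"},
    {"first_name": "Alan", "last_name": "Turing"},
]

# The expression reads existing input columns and produces the new value.
with_full_name = assign_column(
    customers,
    "full_name",
    lambda r: f'{r["first_name"]} {r["last_name"]}'.upper(),
)
```

The expression only reads input columns and writes a single new one, so the original columns pass through unchanged.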
Dealing with Errors
Errors that arise when constructing the plan are reported in the Properties tab of the Status Panel:
Selecting an individual step will show only the warnings and errors for that Step. Double-clicking on an error in the Properties panel will open the step properties dialog to the field which contains the error.
To add a comment to your plan to explain its logic, select Comment from the Palette and click anywhere on the Canvas.
To edit the comment, double-click it. The image below shows the comment editor, which lets you change the comment text as well as the text, background, and border colors:
Running the Plan
When the plan is built and contains no errors, it can be run. To do so, click the Run button, as seen below:
When the plan is finished running, a message will appear:
Viewing the Console Output
During and after plan execution, you can see plan execution logs in the Console tab of the Status Panel:
Viewing the Plan Execution Progress
To open plan execution progress while the plan is being executed, click the Show Progress icon in the Status Panel.
The tab that opens shows the total number of records passing to each step.
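To make the per-step record counts concrete, here is a small Python sketch that wraps each step in a counter. It illustrates the concept only and says nothing about how the product gathers these numbers; the `counted` wrapper, the step functions, and the sample data are hypothetical. In this sketch the count is taken on the records each step emits.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator


def counted(name: str, step: Callable, counts: Counter) -> Callable:
    """Wrap a step so that every record it emits is tallied under `name`."""
    def wrapper(records: Iterable[dict]) -> Iterator[dict]:
        for record in step(records):
            counts[name] += 1
            yield record
    return wrapper


def read_all(records: Iterable[dict]) -> Iterator[dict]:
    """Reader step: passes every record through."""
    yield from records


def keep_with_email(records: Iterable[dict]) -> Iterator[dict]:
    """Filter step: drops records whose email is missing or empty."""
    return (r for r in records if r.get("email"))


counts: Counter = Counter()
steps = [
    counted("reader", read_all, counts),
    counted("filter", keep_with_email, counts),
]

stream: Iterable[dict] = [
    {"email": "a@example.com"},
    {"email": ""},
    {"email": "b@example.com"},
]
for step in steps:
    stream = step(stream)
output = list(stream)
```

Comparing the counts of neighboring steps shows where records are dropped, which is the kind of information a progress view makes visible at a glance.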
Viewing Historical Run Results
To view all plan executions in the current session, switch to the Run Results tab in the Status Panel and select a particular run. You can then review the errors that occurred during that run.