Kwalify Users' Guide (for Ruby and Java)
last update: $Date: 2005-12-20 12:50:56 +0900 (Tue, 20 Dec 2005) $
Preface
Kwalify(*1) is a tiny schema validator for YAML and JSON documents.
You probably know the "80-20 rule", also known as the Pareto Law, which suggests that 20% of the population owns 80% of the wealth. Kwalify is based on a new "50-5 rule", which suggests that 5% of the population owns 50% of the wealth. This rule is more aggressive and more cost-effective than the Pareto Law. It is named "Levi's Law".
schema technology | (A) cover range | (B) cost to pay | (A)/(B) effectiveness |
---|---|---|---|
XML Schema | 95% | 100% | 0.95 (= 95/100) |
RelaxNG | 80% | 20% | 4.0 (= 80/20) |
Kwalify | 50% | 5% | 10.0 (= 50/5) |
Kwalify is small and, in fact, less capable than RelaxNG or XML Schema. I hope you will extend and customize Kwalify to suit your own needs.
(*1) Pronounced as 'Qualify'.
Usage of Kwalify
Usage in Command-Line
Validate YAML documents against a schema definition:

    ### kwalify-ruby
    $ kwalify -f schema.yaml document.yaml [document2.yaml ...]

    ### kwalify-java
    $ java -classpath kwalify.jar kwalify.Main -f schema.yaml document.yaml [document2.yaml ...]

Meta-validate schema definitions:

    ### kwalify-ruby
    $ kwalify -m schema.yaml [schema2.yaml ...]

    ### kwalify-java
    $ java -classpath kwalify.jar kwalify.Main -m schema.yaml [schema2.yaml ...]
Command-line options:
- -h, --help: Print help message.
- -v: Print version.
- -s: Silent mode.
- -f schema.yaml: Specify schema definition file.
- -m: Meta-validation of schema definition.
- -t: Expand tab characters to spaces automatically.
- -l: Show the line number on which each error was found.
- -E: Show errors in Emacs-compatible style (implies the '-l' option).

Notice that the command-line option -l is an experimental feature: when it is specified, the kwalify command uses its own YAML parser instead of the Syck parser.
If you are an Emacs user, try the -E option, which shows errors in a format that Emacs can parse. You can then use C-x ` (next-error) to jump to each error.
Usage in Ruby Script
The following are example scripts for Ruby.

    require 'kwalify'
    require 'yaml'

    ## parse schema definition and create validator
    schema = YAML.load_file('schema.yaml')
    validator = Kwalify::Validator.new(schema)   # raises Kwalify::SchemaError if wrong

    ## validate YAML document
    document = YAML.load_file('document.yaml')
    error_list = validator.validate(document)
    unless error_list.empty?
      error_list.each do |error|   # error is instance of Kwalify::ValidationError
        puts "[#{error.path}] #{error.message}"
      end
    end

The next script parses the document with Kwalify's own parser so that line numbers can be reported.

    require 'kwalify'
    require 'yaml'

    ## parse schema definition and create validator
    schema = YAML.load_file('schema.yaml')
    validator = Kwalify::Validator.new(schema)   # raises Kwalify::SchemaError if wrong

    ## parse YAML document with Kwalify's parser
    str = File.read('document.yaml')
    parser = Kwalify::Parser.new(str)
    document = parser.parse()

    ## validate document and show errors
    error_list = validator.validate(document)
    unless error_list.empty?
      parser.set_errors_linenum(error_list)   # set linenum on error
      error_list.sort.each do |error|
        puts "(line %d)[%s] %s" % [error.linenum, error.path, error.message]
      end
    end

Kwalify's YAML parser is experimental. Note that it supports only the basic syntax of YAML.

The following is an example program for Java.

    import kwalify.*;
    import java.util.*;

    public class Test {
        public static void main(String[] args) throws Exception {
            // read schema
            String schema_str = Util.readFile("schema.yaml");
            Object schema = new YamlParser(schema_str).parse();

            // read document file
            String document_str = Util.readFile("document.yaml");
            YamlParser parser = new YamlParser(document_str);
            Object document = parser.parse();

            // create validator and validate
            Validator validator = new Validator(schema);
            List errors = validator.validate(document);

            // show errors
            if (errors != null && errors.size() > 0) {
                parser.setErrorsLineNumber(errors);
                Collections.sort(errors);
                for (Iterator it = errors.iterator(); it.hasNext(); ) {
                    ValidationException error = (ValidationException)it.next();
                    int linenum = error.getLineNumber();
                    String path = error.getPath();
                    String mesg = error.getMessage();
                    System.out.println("- " + linenum + ": [" + path + "] " + mesg);
                }
            }
        }
    }
Schema Definition
Sequence
schema01.yaml: sequence of string

    type: seq
    sequence:
      - type: str

document01a.yaml: valid document example

    - foo
    - bar
    - baz

    $ kwalify -lf schema01.yaml document01a.yaml
    document01a.yaml#0: valid.

document01b.yaml: invalid document example

    - foo
    - 123
    - baz

    $ kwalify -lf schema01.yaml document01b.yaml
    document01b.yaml#0: INVALID
      - (line 2) [/1] '123': not a string.

The default of 'type:' is str, so you can omit 'type: str'.
Mapping
schema02.yaml: mapping of scalar

    type: map
    mapping:
      name:
        type: str
        required: yes
      email:
        type: str
        pattern: /@/
      age:
        type: int
      birth:
        type: date

document02a.yaml: valid document example

    name: foo
    email: foo@mail.com
    age: 20
    birth: 1985-01-01

    $ kwalify -lf schema02.yaml document02a.yaml
    document02a.yaml#0: valid.

document02b.yaml: invalid document example

    name: foo
    email: foo(at)mail.com
    age: twenty
    birth: Jun 01, 1985

    $ kwalify -lf schema02.yaml document02b.yaml
    document02b.yaml#0: INVALID
      - (line 2) [/email] 'foo(at)mail.com': not matched to pattern /@/.
      - (line 3) [/age] 'twenty': not a integer.
      - (line 4) [/birth] 'Jun 01, 1985': not a date.
Sequence of Mapping
schema03.yaml: sequence of mapping

    type: seq
    sequence:
      - type: map
        mapping:
          name:
            type: str
            required: true
          email:
            type: str

document03a.yaml: valid document example

    - name: foo
      email: foo@mail.com
    - name: bar
      email: bar@mail.net
    - name: baz
      email: baz@mail.org

    $ kwalify -lf schema03.yaml document03a.yaml
    document03a.yaml#0: valid.

document03b.yaml: invalid document example

    - name: foo
      email: foo@mail.com
    - naem: bar
      email: bar@mail.net
    - name: baz
      mail: baz@mail.org

    $ kwalify -lf schema03.yaml document03b.yaml
    document03b.yaml#0: INVALID
      - (line 3) [/1] key 'name:' is required.
      - (line 3) [/1/naem] key 'naem:' is undefined.
      - (line 6) [/2/mail] key 'mail:' is undefined.
Mapping of Sequence
schema04.yaml: mapping of sequence of mapping

    type: map
    mapping:
      company:
        type: str
        required: yes
      email:
        type: str
      employees:
        type: seq
        sequence:
          - type: map
            mapping:
              code:
                type: int
                required: yes
              name:
                type: str
                required: yes
              email:
                type: str

document04a.yaml: valid document example

    company: Kuwata lab.
    email: webmaster@kuwata-lab.com
    employees:
      - code: 101
        name: foo
        email: foo@kuwata-lab.com
      - code: 102
        name: bar
        email: bar@kuwata-lab.com

    $ kwalify -lf schema04.yaml document04a.yaml
    document04a.yaml#0: valid.

document04b.yaml: invalid document example

    company: Kuwata Lab.
    email: webmaster@kuwata-lab.com
    employees:
      - code: A101
        name: foo
        email: foo@kuwata-lab.com
      - code: 102
        name: bar
        mail: bar@kuwata-lab.com

    $ kwalify -lf schema04.yaml document04b.yaml
    document04b.yaml#0: INVALID
      - (line 4) [/employees/0/code] 'A101': not a integer.
      - (line 9) [/employees/1/mail] key 'mail:' is undefined.
Rule and Entry
A rule is a set of entries. With a few exceptions, an entry represents a constraint.
The following are constraint entries.
- required: Value is required when true (default is false).
- enum: List of available values.
- pattern: Specifies a regular expression pattern which the value must match.
- type: Type of value. The following types are available:
  - str
  - int
  - float
  - number (== int or float)
  - text (== str or number)
  - bool
  - date
  - time
  - timestamp
  - seq
  - map
  - scalar (all but seq and map)
  - any (means any data)
- range: Range of value between max/max-ex and min/min-ex. Types seq, map, bool and any are not available with range:.
  - 'max' means 'max-inclusive'.
  - 'min' means 'min-inclusive'.
  - 'max-ex' means 'max-exclusive'.
  - 'min-ex' means 'min-exclusive'.
- length: Range of length of value between max/max-ex and min/min-ex. Only types str and text are available with length:.
- assert: String which represents a validation expression. The string should contain the variable name val, which represents the value. (This is an experimental function and is supported only in kwalify-ruby.)
- unique: Value is unique within the mapping or sequence. See the next subsection for details.
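The difference between the inclusive and exclusive bounds of 'range:' can be sketched in plain Ruby. This is only an illustration: range_errors is a hypothetical helper, not part of Kwalify, and the wording of the messages for 'max-ex' and 'min-ex' is an assumption.

```ruby
# Hypothetical helper (NOT Kwalify's implementation): check a value against
# the four bound entries that a 'range:' entry may contain.
def range_errors(value, range)
  errors = []
  # 'max' and 'min' are inclusive: the bound itself is accepted.
  errors << "too large (> max #{range['max']})" if range.key?('max') && value > range['max']
  errors << "too small (< min #{range['min']})" if range.key?('min') && value < range['min']
  # 'max-ex' and 'min-ex' are exclusive: the bound itself is rejected.
  errors << "too large (>= max-ex #{range['max-ex']})" if range.key?('max-ex') && value >= range['max-ex']
  errors << "too small (<= min-ex #{range['min-ex']})" if range.key?('min-ex') && value <= range['min-ex']
  errors
end

p range_errors(18, { 'min' => 18, 'max' => 30 })   # no errors: 18 is within the inclusive bounds
p range_errors(15, { 'min' => 18, 'max' => 30 })   # one error: below 'min'
p range_errors(18, { 'min-ex' => 18 })             # one error: the exclusive bound itself
```

With 'min: 18' the value 18 passes, while with 'min-ex: 18' the same value is reported.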
The following are non-constraint entries.
- name: Name of schema.
- desc: Description. This is not used for validation.

A rule contains a 'type:' entry. A 'sequence:' entry takes a list of rules. A 'mapping:' entry takes a hash whose values are rules.
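Because rules nest through 'sequence:' and 'mapping:', a schema is a tree of rules. The following sketch walks such a tree with Ruby's standard YAML library and lists every rule it finds. The helpers child_path and collect_rules, and the '*' used as a placeholder for sequence elements, are assumptions made for this illustration and are not part of Kwalify.

```ruby
require 'yaml'

SCHEMA_SRC = <<~EOS
  type: seq
  sequence:
    - type: map
      mapping:
        name:
          required: yes
        groups:
          type: seq
          sequence:
            - type: str
EOS

# Hypothetical helper: build the path of a child rule.
def child_path(path, key)
  path == '/' ? "/#{key}" : "#{path}/#{key}"
end

# Hypothetical helper: collect [path, type] for a rule and all nested rules.
def collect_rules(rule, path = '/', found = [])
  found << [path, rule.fetch('type', 'str')]      # 'type:' defaults to str
  Array(rule['sequence']).each do |child|         # 'sequence:' is a list of rules
    collect_rules(child, child_path(path, '*'), found)
  end
  (rule['mapping'] || {}).each do |key, child|    # 'mapping:' values are rules
    collect_rules(child, child_path(path, key), found)
  end
  found
end

collect_rules(YAML.safe_load(SCHEMA_SRC)).each do |path, type|
  puts "#{path}: #{type}"
end
```

The rule for 'name' has no 'type:' entry, so the sketch reports it with the default type str.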
schema05.yaml: rule examples

    type: seq                              # new rule
    sequence:
      - type: map                          # new rule
        mapping:
          name:
            type: str                      # new rule
            required: yes
          email:
            type: str                      # new rule
            required: yes
            pattern: /@/
          password:
            type: text                     # new rule
            length: { max: 16, min: 8 }
          age:
            type: int                      # new rule
            range: { max: 30, min: 18 }
            # or assert: 18 <= val && val <= 30
          blood:
            type: str                      # new rule
            enum:
              - A
              - B
              - O
              - AB
          birth:
            type: date                     # new rule
          memo:
            type: any                      # new rule

document05a.yaml: valid document example

    - name: foo
      email: foo@mail.com
      password: xxx123456
      age: 20
      blood: A
      birth: 1985-01-01
    - name: bar
      email: bar@mail.net
      age: 25
      blood: AB
      birth: 1980-01-01

    $ kwalify -lf schema05.yaml document05a.yaml
    document05a.yaml#0: valid.

document05b.yaml: invalid document example

    - name: foo
      email: foo(at)mail.com
      password: xxx123
      age: twenty
      blood: a
      birth: 1985-01-01
    - given-name: bar
      family-name: Bar
      email: bar@mail.net
      age: 15
      blood: AB
      birth: 1980/01/01

    $ kwalify -lf schema05.yaml document05b.yaml
    document05b.yaml#0: INVALID
      - (line 2) [/0/email] 'foo(at)mail.com': not matched to pattern /@/.
      - (line 3) [/0/password] 'xxx123': too short (length 6 < min 8).
      - (line 4) [/0/age] 'twenty': not a integer.
      - (line 5) [/0/blood] 'a': invalid blood value.
      - (line 7) [/1/given-name] key 'given-name:' is undefined.
      - (line 7) [/1] key 'name:' is required.
      - (line 8) [/1/family-name] key 'family-name:' is undefined.
      - (line 10) [/1/age] '15': too small (< min 18).
      - (line 12) [/1/birth] '1980/01/01': not a date.
Unique constraint
The 'unique:' constraint entry is available for elements of a sequence or mapping.
It is equivalent to a unique key or primary key in an RDBMS.
The type of a rule which has a 'unique:' entry must be scalar (str, int, float, ...).
The type of its parent rule must be sequence or mapping.
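The idea behind the unique constraint can be sketched in plain Ruby: remember where each value was first seen and report any later occurrence. duplicate_errors is a hypothetical helper written for this illustration, not Kwalify's code. Its message format imitates the error output shown in this guide.

```ruby
# Hypothetical helper (not part of Kwalify): report every value of a sequence
# that already appeared earlier, together with the path of its first occurrence.
def duplicate_errors(values, parent_path)
  first_seen = {}   # value => index of its first occurrence
  errors = []
  values.each_with_index do |value, index|
    if first_seen.key?(value)
      errors << "[#{parent_path}/#{index}] '#{value}': " \
                "is already used at '#{parent_path}/#{first_seen[value]}'."
    else
      first_seen[value] = index
    end
  end
  errors
end

puts duplicate_errors(%w[foo users admin foo], '/0/groups')
```

For the four group names above, the sketch reports the second 'foo' at index 3 and points back to index 0.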
schema06.yaml: unique constraint entry with mapping and sequence

    type: seq
    sequence:
      - type: map
        required: yes
        mapping:
          name:
            type: str
            required: yes
            unique: yes
          email:
            type: str
          groups:
            type: seq
            sequence:
              - type: str
                unique: yes

document06a.yaml: valid document example

    - name: foo
      email: admin@mail.com
      groups:
        - users
        - foo
        - admin
    - name: bar
      email: admin@mail.com
      groups:
        - users
        - admin
    - name: baz
      email: baz@mail.com
      groups:
        - users

    $ kwalify -lf schema06.yaml document06a.yaml
    document06a.yaml#0: valid.

document06b.yaml: invalid document example

    - name: foo
      email: admin@mail.com
      groups:
        - foo
        - users
        - admin
        - foo
    - name: bar
      email: admin@mail.com
      groups:
        - admin
        - users
    - name: bar
      email: baz@mail.com
      groups:
        - users

    $ kwalify -lf schema06.yaml document06b.yaml
    document06b.yaml#0: INVALID
      - (line 7) [/0/groups/3] 'foo': is already used at '/0/groups/0'.
      - (line 13) [/2/name] 'bar': is already used at '/1/name'.
Validator#validate_hook()
You can extend the Kwalify::Validator class (Ruby) or the kwalify.Validator class (Java) and override the Kwalify::Validator#validate_hook() method (Ruby) or the kwalify.Validator#validateHook() method (Java). This method is called by Kwalify::Validator#validate() (Ruby) or kwalify.Validator#validate() (Java).

answers-schema.yaml: schema definition

    type: map
    mapping:
      answers:
        type: seq
        sequence:
          - type: map
            name: Answer
            mapping:
              name:
                type: str
                required: yes
              answer:
                type: str
                required: yes
                enum:
                  - good
                  - not bad
                  - bad
              reason:
                type: str

answers-validator.rb: validate script for Ruby

    #!/usr/bin/env ruby

    require 'kwalify'
    require 'yaml'

    ## validator class for answers
    class AnswersValidator < Kwalify::Validator

      ## load schema definition
      @@schema = YAML.load_file('answers-schema.yaml')

      def initialize()
        super(@@schema)
      end

      ## hook method called by Validator#validate()
      def validate_hook(value, rule, path, errors)
        case rule.name
        when 'Answer'
          if value['answer'] == 'bad'
            reason = value['reason']
            if !reason || reason.empty?
              msg = "reason is required when answer is 'bad'."
              errors << Kwalify::ValidationError.new(msg, path)
            end
          end
        end
      end

    end

    ## create validator
    validator = AnswersValidator.new

    ## load YAML document
    input = ARGF.read()
    document = YAML.load(input)

    ## validate
    errors = validator.validate(document)
    if errors.empty?
      puts "Valid."
    else
      puts "*** INVALID!"
      errors.each do |error|   # error.class == Kwalify::ValidationError
        puts " - [#{error.path}] : #{error.message}"
      end
    end
document07a.yaml: valid document example

    answers:
      - name: Foo
        answer: good
        reason: I like this style.
      - name: Bar
        answer: not bad
      - name: Baz
        answer: bad
        reason: I don't like this style.

    $ ruby answers-validator.rb document07a.yaml
    Valid.

document07b.yaml: invalid document example

    answers:
      - name: Foo
        answer: good
      - name: Bar
        answer: bad
      - name: Baz
        answer: not bad

    $ ruby answers-validator.rb document07b.yaml
    *** INVALID!
     - [/answers/1] : reason is required when answer is 'bad'.
You can validate several documents with a single Validator instance, because the Validator class and the Validator#validate() method are stateless. If you use instance variables in a custom validate_hook() method, the validator becomes stateful.
Here is a Java program equivalent to 'answers-validator.rb'.

    import kwalify.Validator;
    import kwalify.Rule;
    import kwalify.Util;
    import kwalify.YamlUtil;
    import kwalify.YamlParser;
    import kwalify.SyntaxException;
    import kwalify.ValidationException;

    import java.util.*;
    import java.io.IOException;

    /**
     *  validator class for answers
     */
    public class AnswersValidator extends Validator {

        /** schema string */
        private static final String SCHEMA = ""
            + "type: map\n"
            + "mapping:\n"
            + "  answers:\n"
            + "    type: seq\n"
            + "    sequence:\n"
            + "      - type: map\n"
            + "        name: Answer\n"
            + "        mapping:\n"
            + "          name:\n"
            + "            type: str\n"
            + "            required: yes\n"
            + "          answer:\n"
            + "            type: str\n"
            + "            required: yes\n"
            + "            enum:\n"
            + "              - good\n"
            + "              - not bad\n"
            + "              - bad\n"
            + "          reason:\n"
            + "            type: str\n"
            ;

        /** schema object */
        private static Map schema = null;
        static {
            try {
                schema = (Map)YamlUtil.load(SCHEMA);
            } catch (SyntaxException ex) {
                assert false;
            }
        }

        /** constructor */
        public AnswersValidator() {
            super(schema);
        }

        /** hook method called by Validator#validate() */
        protected void validateHook(Object value, Rule rule, String path, List errors) {
            String rule_name = rule.getName();
            if (rule_name != null && rule_name.equals("Answer")) {
                assert value instanceof Map;
                Map val = (Map)value;
                assert val.get("answer") != null;
                if (val.get("answer").equals("bad")) {
                    String reason = (String)val.get("reason");
                    if (reason == null || reason.length() == 0) {
                        String msg = "reason is required when answer is 'bad'.";
                        errors.add(new ValidationException(msg, path));
                    }
                }
            }
        }

        /** main program */
        public static void main(String[] args) throws IOException, SyntaxException {
            // create validator
            Validator validator = new AnswersValidator();

            // load YAML document
            String input;
            if (args.length > 0) {
                input = Util.readFile(args[0]);
            } else {
                input = Util.readInputStream(System.in);
            }
            YamlParser parser = new YamlParser(input);
            Object document = parser.parse();

            // validate and show errors
            List errors = validator.validate(document);
            if (errors == null || errors.size() == 0) {
                System.out.println("Valid.");
            } else {
                System.out.println("*** INVALID!");
                parser.setErrorsLineNumber(errors);
                Collections.sort(errors);
                for (Iterator it = errors.iterator(); it.hasNext(); ) {
                    ValidationException error = (ValidationException)it.next();
                    int linenum = error.getLineNumber();
                    String path = error.getPath();
                    String mesg = error.getMessage();
                    String s = "- line " + linenum + ": [" + path + "] " + mesg;
                    System.out.println(s);
                }
            }
        }

    }

    $ java -classpath kwalify.jar AnswersValidator document07a.yaml
    Valid.
    $ java -classpath kwalify.jar AnswersValidator document07b.yaml
    *** INVALID!
    - line 4: [/answers/1] reason is required when answer is 'bad'.
Validator with Block
Notice: This is an experimental feature.
The Kwalify::Validator.new() method can take a block, which is invoked during validation.

validate08.rb: validate script

    #!/usr/bin/env ruby

    require 'kwalify'
    require 'yaml'

    ## load schema definition
    schema = YAML.load_file('answers-schema.yaml')

    ## create validator for answers
    validator = Kwalify::Validator.new(schema) { |value, rule, path, errors|
      case rule.name
      when 'Answer'
        if value['answer'] == 'bad'
          reason = value['reason']
          if !reason || reason.empty?
            msg = "reason is required when answer is 'bad'."
            errors << Kwalify::ValidationError.new(msg, path)
          end
        end
      end
    }

    ## load YAML document
    input = ARGF.read()
    document = YAML.load(input)

    ## validate
    errors = validator.validate(document)
    if errors.empty?
      puts "Valid."
    else
      puts "*** INVALID!"
      errors.each do |error|   # error.class == Kwalify::ValidationError
        puts " - [#{error.path}] : #{error.message}"
      end
    end

    $ ruby validate08.rb document07a.yaml
    Valid.

    $ ruby validate08.rb document07b.yaml
    *** INVALID!
     - [/answers/1] : reason is required when answer is 'bad'.
Tips
Enclose Key Names in (Double) Quotes
YAML allows key names to be enclosed in single quotes (') or double quotes ("). Quoting makes user-defined key names stand out from the schema's own entry names.

schema11a.yaml: enclosing in double-quotes

    type: map
    mapping:
      "name":
        required: yes
      "email":
        pattern: /@/
      "age":
        type: int
      "birth":
        type: date

You may prefer to indent with 1 space and 3 spaces.

schema11b.yaml: indent with 1 space and 3 spaces

    type: map
    mapping:
     "name":
        required: yes
     "email":
        pattern: /@/
     "age":
        type: int
     "birth":
        type: date
JSON
JSON is a lightweight data-interchange format, especially useful for JavaScript. JSON can be considered a subset of YAML. This means that a YAML parser can parse JSON, and therefore Kwalify can validate JSON documents.
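As a small illustration of this point, the following sketch feeds the same simple JSON text to Ruby's standard YAML and JSON libraries and compares the results. It does not use Kwalify. JSON_TEXT and load_both are names invented for this example, and the sketch covers only a simple document, not every corner of the JSON syntax.

```ruby
require 'yaml'
require 'json'

JSON_TEXT = <<~EOS
  { "name": "Foo",
    "age": 20,
    "favorite": [ "football", "basketball" ]
  }
EOS

# Hypothetical helper: parse the same text with the YAML and the JSON parser.
def load_both(text)
  [YAML.safe_load(text), JSON.parse(text)]
end

from_yaml, from_json = load_both(JSON_TEXT)
p from_yaml
puts(from_yaml == from_json ? "same data" : "different data")
```

Both parsers yield a Hash with the same keys and values, which is why a schema or document written in JSON can be loaded through a YAML parser.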
schema12.yaml: an example schema written in JSON format

    { "type": "map",
      "required": true,
      "mapping": {
        "name": {
          "type": "str",
          "required": true
        },
        "email": {
          "type": "str"
        },
        "age": {
          "type": "int"
        },
        "gender": {
          "type": "str",
          "enum": ["M", "F"]
        },
        "favorite": {
          "type": "seq",
          "sequence": [
            { "type": "str" }
          ]
        }
      }
    }

document12a.yaml: valid JSON document example

    { "name": "Foo",
      "email": "foo@mail.com",
      "age": 20,
      "gender": "F",
      "favorite": [
        "football",
        "basketball",
        "baseball"
      ]
    }

    $ kwalify -lf schema12.yaml document12a.yaml
    document12a.yaml#0: valid.

document12b.yaml: invalid JSON document example

    {
      "mail": "foo@mail.com",
      "age": twenty,
      "gender": "X",
      "favorite": [ 123, 456 ]
    }

    $ kwalify -lf schema12.yaml document12b.yaml
    document12b.yaml#0: INVALID
      - (line 1) [/] key 'name:' is required.
      - (line 2) [/mail] key 'mail:' is undefined.
      - (line 3) [/age] 'twenty': not a integer.
      - (line 4) [/gender] 'X': invalid gender value.
      - (line 5) [/favorite/0] '123': not a string.
      - (line 5) [/favorite/1] '456': not a string.
Anchor
You can share rules within a schema by using YAML anchors.

schema13.yaml: anchor example

    type: seq
    sequence:
      - &employee
        type: map
        mapping:
          "given-name": &name
            type: str
            required: yes
          "family-name": *name
          "post":
            enum:
              - executive
              - manager
              - clerk
          "supervisor": *employee

Anchors are also available in YAML documents.

document13a.yaml: valid document example

    - &foo
      given-name: foo
      family-name: Foo
      post: executive
    - &bar
      given-name: bar
      family-name: Bar
      post: manager
      supervisor: *foo
    - given-name: baz
      family-name: Baz
      post: clerk
      supervisor: *bar
    - given-name: zak
      family-name: Zak
      post: clerk
      supervisor: *bar

    $ kwalify -lf schema13.yaml document13a.yaml
    document13a.yaml#0: valid.
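To see what an anchor and its alias look like after parsing, the following sketch loads a shortened document with Ruby's standard YAML library (not with Kwalify). ANCHOR_DOC and load_with_aliases are names invented for this example. The aliases: true option is needed because YAML.safe_load rejects aliases by default.

```ruby
require 'yaml'

ANCHOR_DOC = <<~EOS
  - &foo
    given-name: foo
    post: manager
  - given-name: bar
    post: clerk
    supervisor: *foo
EOS

# Hypothetical helper: load a YAML text in which aliases are allowed.
def load_with_aliases(text)
  YAML.safe_load(text, aliases: true)
end

people = load_with_aliases(ANCHOR_DOC)
p people[1]['supervisor']
# The alias *foo carries the same data as the anchored mapping &foo.
puts(people[1]['supervisor'] == people[0] ? "alias matches anchor" : "mismatch")
```

The value of 'supervisor' is the whole mapping that was anchored as &foo, so a validator sees it as an ordinary nested mapping.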
Default of Mapping
YAML allows the user to specify a default value for a mapping.
For example, the following YAML document uses a default value.

    A: 10
    B: 20
    =: -1      # default value

This is equivalent to the following Ruby code.

    map = {"A"=>10, "B"=>20}
    map.default = -1
    map

Kwalify allows the user to specify a default rule using the default value of a mapping. This is useful when key names are unknown.
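One way to picture a default rule is a lookup that falls back to the rule stored under '=' when no rule is declared for a key. The sketch below is an assumption made for illustration only: rule_for and RULES are hypothetical names, and this is not how Kwalify is implemented internally.

```ruby
# Hypothetical rule table: one declared key plus a default rule under '='.
RULES = {
  'name' => { 'type' => 'str' },
  '='    => { 'type' => 'number', 'range' => { 'max' => 1, 'min' => -1 } }
}.freeze

# Hypothetical lookup (not Kwalify's code): prefer the rule declared for the
# key, otherwise fall back to the default rule.
def rule_for(mapping_rules, key)
  mapping_rules.fetch(key) { mapping_rules['='] }
end

p rule_for(RULES, 'name')     # the rule declared for 'name'
p rule_for(RULES, 'value1')   # no rule for 'value1', so the default rule
```

With such a lookup, keys like value1, value2 and value3 need no entries of their own in the schema.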
schema14.yaml: default rule example

    type: map
    mapping:
      =:                 # default rule
        type: number
        range: { max: 1, min: -1 }

document14a.yaml: valid document example

    value1: 0
    value2: 0.5
    value3: -0.9

    $ kwalify -lf schema14.yaml document14a.yaml
    document14a.yaml#0: valid.

document14b.yaml: invalid document example

    value1: 0
    value2: 1.1
    value3: -2.0

    $ kwalify -lf schema14.yaml document14b.yaml
    document14b.yaml#0: INVALID
      - (line 2) [/value2] '1.1': too large (> max 1).
      - (line 3) [/value3] '-2.0': too small (< min -1).
Merging Mappings
YAML allows the user to merge mappings.

    - &a1
      A: 10
      B: 20
    - <<: *a1      # merge
      A: 15        # override
      C: 30        # add

This is equivalent to the following Ruby code.

    a1 = {"A"=>10, "B"=>20}
    tmp = {}
    tmp.update(a1)      # merge
    tmp["A"] = 15       # override
    tmp["C"] = 30       # add

This feature allows Kwalify to merge rule entries.
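The merge can also be observed directly by loading the YAML text with Ruby's standard YAML library. The sketch below does not use Kwalify. MERGE_DOC and load_merged are names invented for this example, and aliases: true is passed because YAML.safe_load rejects aliases by default.

```ruby
require 'yaml'

MERGE_DOC = <<~EOS
  - &a1
    A: 10
    B: 20
  - <<: *a1
    A: 15
    C: 30
EOS

# Hypothetical helper: load a YAML text that uses anchors and merge keys.
def load_merged(text)
  YAML.safe_load(text, aliases: true)
end

base, merged = load_merged(MERGE_DOC)
p base     # the anchored mapping is left unchanged
p merged   # A is overridden, B is inherited, C is added
```

The second mapping ends up with A set to 15, B set to 20 and C set to 30, while the anchored mapping keeps its original values.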
schema15.yaml: merging rule entries example

    type: map
    mapping:
      "group":
        type: map
        mapping:
          "name": &name
            type: str
            required: yes
          "email": &email
            type: str
            pattern: /@/
            required: no
      "user":
        type: map
        mapping:
          "name":
            <<: *name             # merge
            length: { max: 16 }   # add
          "email":
            <<: *email            # merge
            required: yes         # override

document15a.yaml: valid document example

    group:
      name: foo
      email: foo@mail.com
    user:
      name: bar
      email: bar@mail.com

    $ kwalify -lf schema15.yaml document15a.yaml
    document15a.yaml#0: valid.

document15b.yaml: invalid document example

    group:
      name: foo
      email: foo@mail.com
    user:
      name: toooooo-looooong-name

    $ kwalify -lf schema15.yaml document15b.yaml
    document15b.yaml#0: INVALID
      - (line 4) [/user] key 'email:' is required.
      - (line 5) [/user/name] 'toooooo-looooong-name': too long (length 21 > max 16).