diff --git a/README.md b/README.md index ee6a3ea..876f31e 100644 --- a/README.md +++ b/README.md @@ -1,57 +1,351 @@ -# go-toml V2 +# go-toml v2 -Development branch. Use at your own risk. +Go library for the [TOML](https://toml.io/en/) format. -[👉 Discussion on github](https://github.com/pelletier/go-toml/discussions/471). +This library supports [TOML v1.0.0](https://toml.io/en/v1.0.0). -* `toml.Unmarshal()` should work as well as v1. -## Must do +## Development status -### Unmarshal +This is the upcoming major version of go-toml. It is currently in active +development. As of release v2.0.0-beta.1, the library has reached feature parity +with v1, and fixes a lot of known bugs and performance issues along the way. -- [x] Unmarshal into maps. -- [x] Support Array Tables. -- [x] Unmarshal into pointers. -- [x] Support Date / times. -- [x] Support struct tags annotations. -- [x] Support Arrays. -- [x] Support Unmarshaler interface. -- [x] Original go-toml unmarshal tests pass. -- [x] Benchmark! -- [x] Abstract AST. -- [x] Original go-toml testgen tests pass. -- [x] Track file position (line, column) for errors. -- [x] Strict mode. -- [ ] Document Unmarshal / Decode +If you do not need the advanced document editing features of v1, you are +encouraged to try out this version. -### Marshal +👉 [Roadmap for v2](https://github.com/pelletier/go-toml/discussions/506). -- [x] Minimal implementation -- [x] Multiline strings -- [ ] Multiline arrays -- [ ] `inline` tag for tables -- [ ] Optional indentation -- [ ] Option to pick default quotes -### Document +## Documentation -- [ ] Gather requirements and design API. +Full API, examples, and implementation notes are available in the Go documentation. -## Ideas +[![Go Reference](https://pkg.go.dev/badge/github.com/pelletier/go-toml/v2.svg)](https://pkg.go.dev/github.com/pelletier/go-toml/v2) -- [ ] Allow types to implement a `ASTUnmarshaler` interface to unmarshal - straight from the AST? 
-- [x] Rewrite AST to use a single array as storage instead of one allocation per - node. -- [ ] Provide "minimal allocations" option that uses `unsafe` to reuse the input - byte array as storage for strings. -- [x] Cache reflection operations per type. -- [ ] Optimize tracker pass. -## Differences with v1 +## Import -* [unmarshal](https://github.com/pelletier/go-toml/discussions/488) +```go +import "github.com/pelletier/go-toml/v2" +``` + +## Features + +### Stdlib behavior + +As much as possible, this library is designed to behave similarly to the +standard library's `encoding/json`. + +### Performance + +While go-toml favors usability, it is written with performance in mind. Most +operations should not be shockingly slow. + +### Strict mode + +`Decoder` can be set to "strict mode", which makes it error when some parts of +the TOML document were not present in the target structure. This is a great way +to check for typos. [See example in the documentation][strict]. + +[strict]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#example-Decoder.SetStrict + +### Contextualized errors + +When decoding errors occur, go-toml returns [`DecodeError`][decode-err], which +contains a human-readable contextualized version of the error. For example: + +``` +2| key1 = "value1" +3| key2 = "missing2" + | ~~~~ missing field +4| key3 = "missing3" +5| key4 = "value4" +``` + +[decode-err]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#DecodeError + +### Local date and time support + +TOML supports native [local date/times][ldt]. This allows representing a given +date, time, or date-time without relation to a timezone or offset. To support +this use-case, go-toml provides [`LocalDate`][tld], [`LocalTime`][tlt], and +[`LocalDateTime`][tldt]. Those types can be transformed to and from `time.Time`, +making them convenient yet unambiguous structures for their respective TOML +representation. 
+ +[ldt]: https://toml.io/en/v1.0.0#local-date-time +[tld]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalDate +[tlt]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalTime +[tldt]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalDateTime + +## Getting started + +Given the following struct, let's see how to read it and write it as TOML: + +```go +type MyConfig struct { + Version int + Name string + Tags []string +} +``` + +### Unmarshaling + +[`Unmarshal`][unmarshal] reads a TOML document and fills a Go structure with its +content. For example: + +```go +doc := ` +version = 2 +name = "go-toml" +tags = ["go", "toml"] +` + +var cfg MyConfig +err := toml.Unmarshal([]byte(doc), &cfg) +if err != nil { + panic(err) +} +fmt.Println("version:", cfg.Version) +fmt.Println("name:", cfg.Name) +fmt.Println("tags:", cfg.Tags) + +// Output: +// version: 2 +// name: go-toml +// tags: [go toml] +``` + +[unmarshal]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Unmarshal + +### Marshaling + +[`Marshal`][marshal] is the opposite of Unmarshal: it represents a Go structure +as a TOML document: + +```go +cfg := MyConfig{ + Version: 2, + Name: "go-toml", + Tags: []string{"go", "toml"}, +} + +b, err := toml.Marshal(cfg) +if err != nil { + panic(err) +} +fmt.Println(string(b)) + +// Output: +// Version = 2 +// Name = 'go-toml' +// Tags = ['go', 'toml'] +``` + +[marshal]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Marshal + +## Migrating from v1 + +This section describes the differences between v1 and v2, with some pointers on +how to get the original behavior when possible. + +### Decoding / Unmarshal + +#### Automatic field name guessing + +When unmarshaling to a struct, if a key in the TOML document does not exactly +match the name of a struct field or any of the `toml`-tagged fields, v1 tries +multiple variations of the key ([code][v1-keys]). + +V2 instead does case-insensitive matching, like `encoding/json`. 
+ +This could impact you if you are relying on casing to differentiate two fields, +and one of them is not using the `toml` struct tag. The recommended solution +is to be specific about tag names for those fields using the `toml` struct tag. + +[v1-keys]: https://github.com/pelletier/go-toml/blob/a2e52561804c6cd9392ebf0048ca64fe4af67a43/marshal.go#L775-L781 + +#### Ignore preexisting value in interface + +When decoding into a non-nil `interface{}`, go-toml v1 uses the type of the +element in the interface to decode the object. For example: + +```go +type inner struct { + B interface{} +} +type doc struct { + A interface{} +} + +d := doc{ + A: inner{ + B: "Before", + }, +} + +data := ` +[A] +B = "After" +` + +toml.Unmarshal([]byte(data), &d) +fmt.Printf("toml v1: %#v\n", d) + +// toml v1: main.doc{A:main.inner{B:"After"}} +``` + +In this case, field `A` is of type `interface{}`, containing an `inner` struct. +V1 sees that type and uses it when decoding the object. + +When decoding an object into an `interface{}`, V2 instead disregards whatever +value the `interface{}` may contain and replaces it with a +`map[string]interface{}`. With the same data structure as above, here is what +the result looks like: + +```go +toml.Unmarshal([]byte(data), &d) +fmt.Printf("toml v2: %#v\n", d) + +// toml v2: main.doc{A:map[string]interface {}{"B":"After"}} +``` + +This is to match `encoding/json`'s behavior. There is no way to make the v2 +decoder behave like v1. + +#### Values out of array bounds ignored + +When decoding into an array, v1 returns an error when the number of elements +contained in the doc exceeds the length of the array. 
For example: + +```go +type doc struct { + A [2]string +} +d := doc{} +err := toml.Unmarshal([]byte(`A = ["one", "two", "many"]`), &d) +fmt.Println(err) + +// (1, 1): unmarshal: TOML array length (3) exceeds destination array length (2) +``` + +In the same situation, v2 ignores the last value: + +```go +err := toml.Unmarshal([]byte(`A = ["one", "two", "many"]`), &d) +fmt.Println("err:", err, "d:", d) +// err: <nil> d: {[one two]} +``` + +This is to match `encoding/json`'s behavior. There is no way to make the v2 +decoder behave like v1. + +#### Support for `toml.Unmarshaler` has been dropped + +This method was not widely used, poorly defined, and added a lot of complexity. +A similar effect can be achieved by implementing the `encoding.TextUnmarshaler` +interface and using strings. + +### Encoding / Marshal + +#### Default struct fields order + +V1 emits struct fields in alphabetical order by default. V2 emits struct fields +in the order they are defined. For example: + +```go +type S struct { + B string + A string +} + +data := S{ + B: "B", + A: "A", +} + +b, _ := tomlv1.Marshal(data) +fmt.Println("v1:\n" + string(b)) + +b, _ = tomlv2.Marshal(data) +fmt.Println("v2:\n" + string(b)) + +// Output: +// v1: +// A = "A" +// B = "B" + +// v2: +// B = 'B' +// A = 'A' +``` + +There is no way to make v2 encoder behave like v1. A workaround could be to +manually sort the fields alphabetically in the struct definition. + +#### No indentation by default + +V1 automatically indents content of tables by default. V2 does not. However the +same behavior can be obtained using [`Encoder.SetIndentTables`][sit]. 
For example: + + +```go +data := map[string]interface{}{ + "table": map[string]string{ + "key": "value", + }, +} + +b, _ := tomlv1.Marshal(data) +fmt.Println("v1:\n" + string(b)) + +b, _ = tomlv2.Marshal(data) +fmt.Println("v2:\n" + string(b)) + +buf := bytes.Buffer{} +enc := tomlv2.NewEncoder(&buf) +enc.SetIndentTables(true) +enc.Encode(data) +fmt.Println("v2 Encoder:\n" + string(buf.Bytes())) + +// Output: +// v1: +// +// [table] +// key = "value" +// +// v2: +// [table] +// key = 'value' +// +// +// v2 Encoder: +// [table] +// key = 'value' +``` + +[sit]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Encoder.SetIndentTables + +#### Keys and strings are single quoted + +V1 always uses double quotes (`"`) around strings and keys that cannot be +represented bare (unquoted). V2 uses single quotes instead by default (`'`), +unless a character cannot be represented, then falls back to double quotes. + +There is no way to make v2 encoder behave like v1. + +#### `TextMarshaler` emits as a string, not TOML + +Types that implement [`encoding.TextMarshaler`][tm] can emit arbitrary TOML in +v1. The encoder would append the result to the output directly. In v2 the result +is wrapped in a string. As a result, this interface cannot be implemented by the +root object. + +There is no way to make v2 encoder behave like v1. + +[tm]: https://golang.org/pkg/encoding/#TextMarshaler ## License diff --git a/doc.go b/doc.go new file mode 100644 index 0000000..d541e4b --- /dev/null +++ b/doc.go @@ -0,0 +1,4 @@ +/* + Package toml is a library to read and write TOML documents. +*/ +package toml diff --git a/errors.go b/errors.go index 82c21ef..4486505 100644 --- a/errors.go +++ b/errors.go @@ -90,12 +90,13 @@ func (e *DecodeError) Position() (row int, column int) { return e.line, e.column } -// Key that was being processed when the error occurred. +// Key that was being processed when the error occurred. The key is present only +// if this DecodeError is part of a StrictMissingError. 
func (e *DecodeError) Key() Key { return e.key } -// decodeErrorFromHighlight creates a DecodeError referencing to a highlighted +// decodeErrorFromHighlight creates a DecodeError referencing a highlighted // range of bytes from document. // // highlight needs to be a sub-slice of document, or this function panics. diff --git a/errors_test.go b/errors_test.go index efbb59a..893dc9c 100644 --- a/errors_test.go +++ b/errors_test.go @@ -3,6 +3,7 @@ package toml import ( "bytes" "errors" + "fmt" "strings" "testing" @@ -179,3 +180,23 @@ line 5`, }) } } + +func ExampleDecodeError() { + doc := `name = 123__456` + + s := map[string]interface{}{} + err := Unmarshal([]byte(doc), &s) + + fmt.Println(err) + + de := err.(*DecodeError) + fmt.Println(de.String()) + + row, col := de.Position() + fmt.Println("error occurred at row", row, "column", col) + // Output: + // toml: number must have at least one digit between underscores + // 1| name = 123__456 + // | ~~ number must have at least one digit between underscores + // error occurred at row 1 column 11 +} diff --git a/marshaler.go b/marshaler.go index aa8783a..713d8af 100644 --- a/marshaler.go +++ b/marshaler.go @@ -50,7 +50,9 @@ func NewEncoder(w io.Writer) *Encoder { // SetTablesInline forces the encoder to emit all tables inline. // // This behavior can be controlled on an individual struct field basis with the -// `inline="true"` tag. +// inline tag: +// +// MyField `inline:"true"` func (enc *Encoder) SetTablesInline(inline bool) { enc.tablesInline = inline } @@ -58,8 +60,9 @@ func (enc *Encoder) SetTablesInline(inline bool) { // SetArraysMultiline forces the encoder to emit all arrays with one element per // line. // -// This behavior can be controlled on an individual struct field basis with the -// `multiline="true"` tag. 
+// This behavior can be controlled on an individual struct field basis with the multiline tag: +// +// MyField `multiline:"true"` func (enc *Encoder) SetArraysMultiline(multiline bool) { enc.arraysMultiline = multiline } @@ -80,31 +83,42 @@ func (enc *Encoder) SetIndentTables(indent bool) { // // If v cannot be represented to TOML it returns an error. // -// Encoding rules: +// Encoding rules // -// 1. A top level slice containing only maps or structs is encoded as [[table +// A top level slice containing only maps or structs is encoded as [[table // array]]. // -// 2. All slices not matching rule 1 are encoded as [array]. As a result, any -// map or struct they contain is encoded as an {inline table}. +// All other slices are encoded as [array]. As a result, any map +// or struct they contain is encoded as an {inline table}. // -// 3. Nil interfaces and nil pointers are not supported. +// Nil interfaces and nil pointers are not supported. // -// 4. Keys in key-values always have one part. +// Keys in key-values always have one part. // -// 5. Intermediate tables are always printed. +// Intermediate tables are always printed. // // By default, strings are encoded as literal string, unless they contain either // a newline character or a single quote. In that case they are emitted as quoted // strings. // // When encoding structs, fields are encoded in order of definition, with their -// exact name. The following struct tags are available: +// exact name. // -// `toml:"foo"`: changes the name of the key to use for the field to foo. +// Struct tags // -// `multiline:"true"`: when the field contains a string, it will be emitted as -// a quoted multi-line TOML string. +// The following struct tags are available to tweak encoding on a per-field +// basis: +// +// toml:"foo" +// Changes the name of the key to use for the field to foo. +// +// multiline:"true" +// When the field contains a string, it will be emitted as a quoted +// multi-line TOML string. 
+// +// inline:"true" +// When the field would normally be encoded as a table, it is instead +// encoded as an inline table. func (enc *Encoder) Encode(v interface{}) error { var ( b []byte diff --git a/marshaler_test.go b/marshaler_test.go index 470bead..d719406 100644 --- a/marshaler_test.go +++ b/marshaler_test.go @@ -511,3 +511,28 @@ func TestIssue424(t *testing.T) { require.NoError(t, err) require.Equal(t, msg2, msg2parsed) } + +func ExampleMarshal() { + type MyConfig struct { + Version int + Name string + Tags []string + } + + cfg := MyConfig{ + Version: 2, + Name: "go-toml", + Tags: []string{"go", "toml"}, + } + + b, err := toml.Marshal(cfg) + if err != nil { + panic(err) + } + fmt.Println(string(b)) + + // Output: + // Version = 2 + // Name = 'go-toml' + // Tags = ['go', 'toml'] +} diff --git a/unmarshaler.go b/unmarshaler.go index 3864be6..3d25a86 100644 --- a/unmarshaler.go +++ b/unmarshaler.go @@ -14,6 +14,9 @@ import ( "github.com/pelletier/go-toml/v2/internal/unsafe" ) +// Unmarshal deserializes a TOML document into a Go value. +// +// It is a shortcut for Decoder.Decode() with the default options. func Unmarshal(data []byte, v interface{}) error { p := parser{} p.Reset(data) @@ -48,11 +51,39 @@ func (d *Decoder) SetStrict(strict bool) { // Decode the whole content of r into v. // -// When a TOML local date is decoded into a time.Time, its value is represented -// in time.Local timezone. +// By default, values in the document that don't exist in the target Go value +// are ignored. See Decoder.SetStrict() to change this behavior. +// +// When a TOML local date, time, or date-time is decoded into a time.Time, its +// value is represented in time.Local timezone. Otherwise the appropriate Local* +// structure is used. // // Empty tables decoded in an interface{} create an empty initialized // map[string]interface{}. +// +// Types implementing the encoding.TextUnmarshaler interface are decoded from a +// TOML string. 
+// +// When decoding a number, go-toml will return an error if the number is out of +// bounds for the target type (which includes negative numbers when decoding +// into an unsigned int). +// +// Type mapping +// +// List of supported TOML types and their associated accepted Go types: +// +// String -> string +// Integer -> uint*, int*, depending on size +// Float -> float*, depending on size +// Boolean -> bool +// Offset Date-Time -> time.Time +// Local Date-time -> LocalDateTime, time.Time +// Local Date -> LocalDate, time.Time +// Local Time -> LocalTime, time.Time +// Array -> slice and array, depending on elements types +// Table -> map and struct +// Inline Table -> same as Table +// Array of Tables -> same as Array and Table func (d *Decoder) Decode(v interface{}) error { b, err := ioutil.ReadAll(d.r) if err != nil { diff --git a/unmarshaler_test.go b/unmarshaler_test.go index 9814cd1..6235727 100644 --- a/unmarshaler_test.go +++ b/unmarshaler_test.go @@ -1321,3 +1321,31 @@ key3 = "value3" // | ~~~~ missing field // 4| key3 = "value3" } + +func ExampleUnmarshal() { + type MyConfig struct { + Version int + Name string + Tags []string + } + + doc := ` + version = 2 + name = "go-toml" + tags = ["go", "toml"] + ` + + var cfg MyConfig + err := toml.Unmarshal([]byte(doc), &cfg) + if err != nil { + panic(err) + } + fmt.Println("version:", cfg.Version) + fmt.Println("name:", cfg.Name) + fmt.Println("tags:", cfg.Tags) + + // Output: + // version: 2 + // name: go-toml + // tags: [go toml] +}