This commit is contained in:
Thomas Pelletier
2021-05-08 17:03:51 -04:00
committed by GitHub
parent ea225df3ed
commit 45ea20024b
8 changed files with 476 additions and 58 deletions
+334 -40
View File
@@ -1,57 +1,351 @@
# go-toml V2
# go-toml v2
Development branch. Use at your own risk.
Go library for the [TOML](https://toml.io/en/) format.
[👉 Discussion on github](https://github.com/pelletier/go-toml/discussions/471).
This library supports [TOML v1.0.0](https://toml.io/en/v1.0.0).
* `toml.Unmarshal()` should work as well as v1.
## Must do
## Development status
### Unmarshal
This is the upcoming major version of go-toml. It is currently in active
development. As of release v2.0.0-beta.1, the library has reached feature parity
with v1, and fixes a lot known bugs and performance issues along the way.
- [x] Unmarshal into maps.
- [x] Support Array Tables.
- [x] Unmarshal into pointers.
- [x] Support Date / times.
- [x] Support struct tags annotations.
- [x] Support Arrays.
- [x] Support Unmarshaler interface.
- [x] Original go-toml unmarshal tests pass.
- [x] Benchmark!
- [x] Abstract AST.
- [x] Original go-toml testgen tests pass.
- [x] Track file position (line, column) for errors.
- [x] Strict mode.
- [ ] Document Unmarshal / Decode
If you do not need the advanced document editing features of v1, you are
encouraged to try out this version.
### Marshal
👉 [Roadmap for v2](https://github.com/pelletier/go-toml/discussions/506).
- [x] Minimal implementation
- [x] Multiline strings
- [ ] Multiline arrays
- [ ] `inline` tag for tables
- [ ] Optional indentation
- [ ] Option to pick default quotes
### Document
## Documentation
- [ ] Gather requirements and design API.
Full API, examples, and implementation notes are available in the Go documentation.
## Ideas
[![Go Reference](https://pkg.go.dev/badge/github.com/pelletier/go-toml/v2.svg)](https://pkg.go.dev/github.com/pelletier/go-toml/v2)
- [ ] Allow types to implement a `ASTUnmarshaler` interface to unmarshal
straight from the AST?
- [x] Rewrite AST to use a single array as storage instead of one allocation per
node.
- [ ] Provide "minimal allocations" option that uses `unsafe` to reuse the input
byte array as storage for strings.
- [x] Cache reflection operations per type.
- [ ] Optimize tracker pass.
## Differences with v1
## Import
* [unmarshal](https://github.com/pelletier/go-toml/discussions/488)
```go
import "github.com/pelletier/go-toml/v2"
```
## Features
### Stdlib behavior
As much as possible, this library is designed to behave similarly as the
standard library's `encoding/json`.
### Performance
While go-toml favors usability, it is written with performance in mind. Most
operations should not be shockingly slow.
### Strict mode
`Decoder` can be set to "strict mode", which makes it error when some parts of
the TOML document was not prevent in the target structure. This is a great way
to check for typos. [See example in the documentation][strict].
[strict]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#example-Decoder.SetStrict
### Contextualized errors
When decoding errors occur, go-toml returns [`DecodeError`][decode-err]), which
contains a human readable contextualized version of the error. For example:
```
2| key1 = "value1"
3| key2 = "missing2"
| ~~~~ missing field
4| key3 = "missing3"
5| key4 = "value4"
```
[decode-err]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#DecodeError
### Local date and time support
TOML supports native [local date/times][ldt]. It allows to represent a given
date, time, or date-time without relation to a timezone or offset. To support
this use-case, go-toml provides [`LocalDate`][tld], [`LocalTime`][tlt], and
[`LocalDateTime`][tldt]. Those types can be transformed to and from `time.Time`,
making them convenient yet unambiguous structures for their respective TOML
representation.
[ldt]: https://toml.io/en/v1.0.0#local-date-time
[tld]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalDate
[tlt]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalTime
[tldt]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalDateTime
## Getting started
Given the following struct, let's see how to read it and write it as TOML:
```go
type MyConfig struct {
Version int
Name string
Tags []string
}
```
### Unmarshaling
[`Unmarshal`][unmarshal] reads a TOML document and fills a Go structure with its
content. For example:
```go
doc := `
version = 2
name = "go-toml"
tags = ["go", "toml"]
`
var cfg MyConfig
err := toml.Unmarshal([]byte(doc), &cfg)
if err != nil {
panic(err)
}
fmt.Println("version:", cfg.Version)
fmt.Println("name:", cfg.Name)
fmt.Println("tags:", cfg.Tags)
// Output:
// version: 2
// name: go-toml
// tags: [go toml]
```
[unmarshal]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Unmarshal
### Marshaling
[`Marshal`][marshal] is the opposite of Unmarshal: it represents a Go structure
as a TOML document:
```go
cfg := MyConfig{
Version: 2,
Name: "go-toml",
Tags: []string{"go", "toml"},
}
b, err := toml.Marshal(cfg)
if err != nil {
panic(err)
}
fmt.Println(string(b))
// Output:
// Version = 2
// Name = 'go-toml'
// Tags = ['go', 'toml']
```
[marshal]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Marshal
## Migrating from v1
This section describes the differences between v1 and v2, with some pointers on
how to get the original behavior when possible.
### Decoding / Unmarshal
#### Automatic field name guessing
When unmarshaling to a struct, if a key in the TOML document does not exactly
match the name of a struct field or any of the `toml`-tagged field, v1 tries
multiple variations of the key ([code][v1-keys]).
V2 instead does a case-insensitive matching, like `encoding/json`.
This could impact you if you are relying on casing to differentiate two fields,
and one of them is a not using the `toml` struct tag. The recommended solution
is to be specific about tag names for those fields using the `toml` struct tag.
[v1-keys]: https://github.com/pelletier/go-toml/blob/a2e52561804c6cd9392ebf0048ca64fe4af67a43/marshal.go#L775-L781
#### Ignore preexisting value in interface
When decoding into a non-nil `interface{}`, go-toml v1 uses the type of the
element in the interface to decode the object. For example:
```go
type inner struct {
B interface{}
}
type doc struct {
A interface{}
}
d := doc{
A: inner{
B: "Before",
},
}
data := `
[A]
B = "After"
`
toml.Unmarshal([]byte(data), &d)
fmt.Printf("toml v1: %#v\n", d)
// toml v1: main.doc{A:main.inner{B:"After"}}
```
In this case, field `A` is of type `interface{}`, containing a `inner` struct.
V1 sees that type and uses it when decoding the object.
When decoding an object into an `interface{}`, V2 instead disregards whatever
value the `interface{}` may contain and replaces it with a
`map[string]interface{}`. With the same data structure as above, here is what
the result looks like:
```go
toml.Unmarshal([]byte(data), &d)
fmt.Printf("toml v2: %#v\n", d)
// toml v2: main.doc{A:map[string]interface {}{"B":"After"}}
```
This is to match `encoding/json`'s behavior. There is no way to make the v2
decoder behave like v1.
#### Values out of array bounds ignored
When decoding into an array, v1 returns an error when the number of elements
contained in the doc is superior to the capacity of the array. For example:
```go
type doc struct {
A [2]string
}
d := doc{}
err := toml.Unmarshal([]byte(`A = ["one", "two", "many"]`), &d)
fmt.Println(err)
// (1, 1): unmarshal: TOML array length (3) exceeds destination array length (2)
```
In the same situation, v2 ignores the last value:
```go
err := toml.Unmarshal([]byte(`A = ["one", "two", "many"]`), &d)
fmt.Println("err:", err, "d:", d)
// err: <nil> d: {[one two]}
```
This is to match `encoding/json`'s behavior. There is no way to make the v2
decoder behave like v1.
#### Support for `toml.Unmarshaler` has been dropped
This method was not widely used, poorly defined, and added a lot of complexity.
A similar effect can be achieved by implementing the `encoding.TextUnmarshaler`
interface and use strings.
### Encoding / Marshal
#### Default struct fields order
V1 emits struct fields order alphabetically by default. V2 struct fields are
emitted in order they are defined. For example:
```go
type S struct {
B string
A string
}
data := S{
B: "B",
A: "A",
}
b, _ := tomlv1.Marshal(data)
fmt.Println("v1:\n" + string(b))
b, _ = tomlv2.Marshal(data)
fmt.Println("v2:\n" + string(b))
// Output:
// v1:
// A = "A"
// B = "B"
// v2:
// B = 'B'
// A = 'A'
```
There is no way to make v2 encoder behave like v1. A workaround could be to
manually sort the fields alphabetically in the struct definition.
#### No indentation by default
V1 automatically indents content of tables by default. V2 does not. However the
same behavior can be obtained using [`Encoder.SetIndentTables`][sit]. For example:
```go
data := map[string]interface{}{
"table": map[string]string{
"key": "value",
},
}
b, _ := tomlv1.Marshal(data)
fmt.Println("v1:\n" + string(b))
b, _ = tomlv2.Marshal(data)
fmt.Println("v2:\n" + string(b))
buf := bytes.Buffer{}
enc := tomlv2.NewEncoder(&buf)
enc.SetIndentTables(true)
enc.Encode(data)
fmt.Println("v2 Encoder:\n" + string(buf.Bytes()))
// Output:
// v1:
//
// [table]
// key = "value"
//
// v2:
// [table]
// key = 'value'
//
//
// v2 Encoder:
// [table]
// key = 'value'
```
[sit]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Encoder.SetIndentTables
#### Keys and strings are single quoted
V1 always uses double quotes (`"`) around strings and keys that cannot be
represented bare (unquoted). V2 uses single quotes instead by default (`'`),
unless a character cannot be represented, then falls back to double quotes.
There is no way to make v2 encoder behave like v1.
#### `TextMarshaler` emits as a string, not TOML
Types that implement [`encoding.TextMarshaler`][tm] can emit arbitrary TOML in
v1. The encoder would append the result to the output directly. In v2 the result
is wrapped in a string. As a result, this interface cannot be implemented by the
root object.
There is no way to make v2 encoder behave like v1.
[tm]: https://golang.org/pkg/encoding/#TextMarshaler
## License
+4
View File
@@ -0,0 +1,4 @@
/*
Package toml is a library to read and write TOML documents.
*/
package toml
+3 -2
View File
@@ -90,12 +90,13 @@ func (e *DecodeError) Position() (row int, column int) {
return e.line, e.column
}
// Key that was being processed when the error occurred.
// Key that was being processed when the error occurred. The key is present only
// if this DecodeError is part of a StrictMissingError.
func (e *DecodeError) Key() Key {
return e.key
}
// decodeErrorFromHighlight creates a DecodeError referencing to a highlighted
// decodeErrorFromHighlight creates a DecodeError referencing a highlighted
// range of bytes from document.
//
// highlight needs to be a sub-slice of document, or this function panics.
+21
View File
@@ -3,6 +3,7 @@ package toml
import (
"bytes"
"errors"
"fmt"
"strings"
"testing"
@@ -179,3 +180,23 @@ line 5`,
})
}
}
func ExampleDecodeError() {
doc := `name = 123__456`
s := map[string]interface{}{}
err := Unmarshal([]byte(doc), &s)
fmt.Println(err)
de := err.(*DecodeError)
fmt.Println(de.String())
row, col := de.Position()
fmt.Println("error occured at row", row, "column", col)
// Output:
// toml: number must have at least one digit between underscores
// 1| name = 123__456
// | ~~ number must have at least one digit between underscores
// error occured at row 1 column 11
}
+28 -14
View File
@@ -50,7 +50,9 @@ func NewEncoder(w io.Writer) *Encoder {
// SetTablesInline forces the encoder to emit all tables inline.
//
// This behavior can be controlled on an individual struct field basis with the
// `inline="true"` tag.
// inline tag:
//
// MyField `inline:"true"`
func (enc *Encoder) SetTablesInline(inline bool) {
enc.tablesInline = inline
}
@@ -58,8 +60,9 @@ func (enc *Encoder) SetTablesInline(inline bool) {
// SetArraysMultiline forces the encoder to emit all arrays with one element per
// line.
//
// This behavior can be controlled on an individual struct field basis with the
// `multiline="true"` tag.
// This behavior can be controlled on an individual struct field basis with the multiline tag:
//
// MyField `multiline:"true"`
func (enc *Encoder) SetArraysMultiline(multiline bool) {
enc.arraysMultiline = multiline
}
@@ -80,31 +83,42 @@ func (enc *Encoder) SetIndentTables(indent bool) {
//
// If v cannot be represented to TOML it returns an error.
//
// Encoding rules:
// Encoding rules
//
// 1. A top level slice containing only maps or structs is encoded as [[table
// A top level slice containing only maps or structs is encoded as [[table
// array]].
//
// 2. All slices not matching rule 1 are encoded as [array]. As a result, any
// map or struct they contain is encoded as an {inline table}.
// All slices not matching rule 1 are encoded as [array]. As a result, any map
// or struct they contain is encoded as an {inline table}.
//
// 3. Nil interfaces and nil pointers are not supported.
// Nil interfaces and nil pointers are not supported.
//
// 4. Keys in key-values always have one part.
// Keys in key-values always have one part.
//
// 5. Intermediate tables are always printed.
// Intermediate tables are always printed.
//
// By default, strings are encoded as literal string, unless they contain either
// a newline character or a single quote. In that case they are emitted as quoted
// strings.
//
// When encoding structs, fields are encoded in order of definition, with their
// exact name. The following struct tags are available:
// exact name.
//
// `toml:"foo"`: changes the name of the key to use for the field to foo.
// Struct tags
//
// `multiline:"true"`: when the field contains a string, it will be emitted as
// a quoted multi-line TOML string.
// The following struct tags are available to tweak encoding on a per-field
// basis:
//
// toml:"foo"
// Changes the name of the key to use for the field to foo.
//
// multiline:"true"
// When the field contains a string, it will be emitted as a quoted
// multi-line TOML string.
//
// inline:"true"
// When the field would normally be encoded as a table, it is instead
// encoded as an inline table.
func (enc *Encoder) Encode(v interface{}) error {
var (
b []byte
+25
View File
@@ -511,3 +511,28 @@ func TestIssue424(t *testing.T) {
require.NoError(t, err)
require.Equal(t, msg2, msg2parsed)
}
func ExampleMarshal() {
type MyConfig struct {
Version int
Name string
Tags []string
}
cfg := MyConfig{
Version: 2,
Name: "go-toml",
Tags: []string{"go", "toml"},
}
b, err := toml.Marshal(cfg)
if err != nil {
panic(err)
}
fmt.Println(string(b))
// Output:
// Version = 2
// Name = 'go-toml'
// Tags = ['go', 'toml']
}
+33 -2
View File
@@ -14,6 +14,9 @@ import (
"github.com/pelletier/go-toml/v2/internal/unsafe"
)
// Unmarshal deserializes a TOML document into a Go value.
//
// It is a shortcut for Decoder.Decode() with the default options.
func Unmarshal(data []byte, v interface{}) error {
p := parser{}
p.Reset(data)
@@ -48,11 +51,39 @@ func (d *Decoder) SetStrict(strict bool) {
// Decode the whole content of r into v.
//
// When a TOML local date is decoded into a time.Time, its value is represented
// in time.Local timezone.
// By default, values in the document that don't exist in the target Go value
// are ignored. See Decoder.SetStrict() to change this behavior.
//
// When a TOML local date, time, or date-time is decoded into a time.Time, its
// value is represented in time.Local timezone. Otherwise the approriate Local*
// structure is used.
//
// Empty tables decoded in an interface{} create an empty initialized
// map[string]interface{}.
//
// Types implementing the encoding.TextUnmarshaler interface are decoded from a
// TOML string.
//
// When decoding a number, go-toml will return an error if the number is out of
// bounds for the target type (which includes negative numbers when decoding
// into an unsigned int).
//
// Type mapping
//
// List of supported TOML types and their associated accepted Go types:
//
// String -> string
// Integer -> uint*, int*, depending on size
// Float -> float*, depending on size
// Boolean -> bool
// Offset Date-Time -> time.Time
// Local Date-time -> LocalDateTime, time.Time
// Local Date -> LocalDate, time.Time
// Local Time -> LocalTime, time.Time
// Array -> slice and array, depending on elements types
// Table -> map and struct
// Inline Table -> same as Table
// Array of Tables -> same as Array and Table
func (d *Decoder) Decode(v interface{}) error {
b, err := ioutil.ReadAll(d.r)
if err != nil {
+28
View File
@@ -1321,3 +1321,31 @@ key3 = "value3"
// | ~~~~ missing field
// 4| key3 = "value3"
}
func ExampleUnmarshal() {
type MyConfig struct {
Version int
Name string
Tags []string
}
doc := `
version = 2
name = "go-toml"
tags = ["go", "toml"]
`
var cfg MyConfig
err := toml.Unmarshal([]byte(doc), &cfg)
if err != nil {
panic(err)
}
fmt.Println("version:", cfg.Version)
fmt.Println("name:", cfg.Name)
fmt.Println("tags:", cfg.Tags)
// Output:
// version: 2
// name: go-toml
// tags: [go toml]
}