Clarify security model with regard to pointer _count fields.

diff --git a/docs/concepts.rst b/docs/concepts.rst

index 122d29c..b4f657e 100644
--- a/docs/concepts.rst
+++ b/docs/concepts.rst
@@ -10,33 +10,40 @@ The things outlined here are the underlying concepts of the nanopb design.
  
  Proto files
  ===========
-All Protocol Buffers implementations use .proto files to describe the message format.
-The point of these files is to be a portable interface description language.
+All Protocol Buffers implementations use .proto files to describe the message
+format. The point of these files is to be a portable interface description
+language.
  
  Compiling .proto files for nanopb
  ---------------------------------
-Nanopb uses the Google's protoc compiler to parse the .proto file, and then a python script to generate the C header and source code from it::
+Nanopb uses Google's protoc compiler to parse the .proto file, and then a
+Python script to generate the C header and source code from it::
  
      user@host:~$ protoc -omessage.pb message.proto
      user@host:~$ python ../generator/nanopb_generator.py message.pb
      Writing to message.h and message.c
      user@host:~$
  
-Compiling .proto files with nanopb options
-------------------------------------------
-Nanopb defines two extensions for message fields described in .proto files: *max_size* and *max_count*.
-These are the maximum size of a string and maximum count of items in an array::
+Modifying generator behaviour
+-----------------------------
+Using generator options, you can set maximum sizes for fields in order to
+allocate them statically. The preferred way to do this is to create an .options
+file with the same name as your .proto file::
  
-    required string name = 1 [(nanopb).max_size = 40];
-    repeated PhoneNumber phone = 4 [(nanopb).max_count = 5];
+   # Foo.proto
+   message Foo {
+      required string name = 1;
+   }
  
-To use these extensions, you need to place an import statement in the beginning of the file::
+::
  
-    import "nanopb.proto";
+   # Foo.options
+   Foo.name max_size:16
  
-This file, in turn, requires the file *google/protobuf/descriptor.proto*. This is usually installed under */usr/include*. Therefore, to compile a .proto file which uses options, use a protoc command similar to::
+For more information on this, see the `Proto file options`_ section in the
+reference manual.
  
-    protoc -I/usr/include -Inanopb/generator -I. -omessage.pb message.proto
+.. _`Proto file options`: reference.html#proto-file-options
  
  Streams
  =======
@@ -50,6 +57,7 @@ There are a few generic rules for callback functions:
  #) Use state to store your own data, such as a file descriptor.
  #) *bytes_written* and *bytes_left* are updated by pb_write and pb_read.
  #) Your callback may be used with substreams. In this case *bytes_left*, *bytes_written* and *max_size* have smaller values than the original stream. Don't use these values to calculate pointers.
+#) Always read or write the full requested length of data. For example, POSIX *recv()* needs the *MSG_WAITALL* flag to accomplish this.
  
  Output streams
  --------------
@@ -91,9 +99,8 @@ Writing to stdout::
  
  Input streams
  -------------
-For input streams, there are a few extra rules:
+For input streams, there is one extra rule:
  
-#) If buf is NULL, read from stream but don't store the data. This is used to skip unknown input.
  #) You don't need to know the length of the message in advance. After getting EOF error when reading, set bytes_left to 0 and return false. Pb_decode will detect this and if the EOF was in a proper position, it will return true.
  
  Here is the structure::
@@ -167,7 +174,9 @@ Field callbacks
  ===============
  When a field has dynamic length, nanopb cannot statically allocate storage for it. Instead, it allows you to handle the field in whatever way you want, using a callback function.
  
-The `pb_callback_t`_ structure contains a function pointer and a *void* pointer you can use for passing data to the callback. If the function pointer is NULL, the field will be skipped. The actual behavior of the callback function is different in encoding and decoding modes.
+The `pb_callback_t`_ structure contains a function pointer and a *void* pointer called *arg*, which you can use for passing data to the callback. If the function pointer is NULL, the field will be skipped. A pointer to *arg* is passed to the function, so that the callback can both read the value and modify it.
+
+The actual behavior of the callback function is different in encoding and decoding modes. In encoding mode, the callback is called once and should write out everything, including field tags. In decoding mode, the callback is called repeatedly for every data item.
  
  .. _`pb_callback_t`: reference.html#pb-callback-t
  
@@ -175,7 +184,7 @@ Encoding callbacks
  ------------------
  ::
  
-    bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, const void *arg);
+    bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, void * const *arg);
  
  When encoding, the callback should write out complete fields, including the wire type and field number tag. It can write as many or as few fields as it likes. For example, if you want to write out an array as *repeated* field, you should do it all in a single call.
  
@@ -189,7 +198,7 @@ If the callback is used in a submessage, it will be called multiple times during
  
  This callback writes out a dynamically sized string::
  
-    bool write_string(pb_ostream_t *stream, const pb_field_t *field, const void *arg)
+    bool write_string(pb_ostream_t *stream, const pb_field_t *field, void * const *arg)
      {
          char *str = get_string_from_somewhere();
          if (!pb_encode_tag_for_field(stream, field))
@@ -202,7 +211,7 @@ Decoding callbacks
  ------------------
  ::
  
-    bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void *arg);
+    bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void **arg);
  
  When decoding, the callback receives a length-limited substring that reads the contents of a single field. The field tag has already been read. For *string* and *bytes*, the length value has already been parsed, and is available at *stream->bytes_left*.
  
@@ -212,7 +221,7 @@ The callback will be called multiple times for repeated fields. For packed field
  
  This callback reads multiple integers and prints them::
  
-    bool read_ints(pb_istream_t *stream, const pb_field_t *field, void *arg)
+    bool read_ints(pb_istream_t *stream, const pb_field_t *field, void **arg)
      {
          while (stream->bytes_left)
          {
@@ -241,19 +250,129 @@ For example this submessage in the Person.proto file::
  generates this field description array for the structure *Person_PhoneNumber*::
  
   const pb_field_t Person_PhoneNumber_fields[3] = {
-    {1, PB_HTYPE_REQUIRED | PB_LTYPE_STRING,
-    offsetof(Person_PhoneNumber, number), 0,
-    pb_membersize(Person_PhoneNumber, number), 0, 0},
-
-    {2, PB_HTYPE_OPTIONAL | PB_LTYPE_VARINT,
-    pb_delta(Person_PhoneNumber, type, number),
-    pb_delta(Person_PhoneNumber, has_type, type),
-    pb_membersize(Person_PhoneNumber, type), 0,
-    &Person_PhoneNumber_type_default},
-
+    PB_FIELD(  1, STRING  , REQUIRED, STATIC, Person_PhoneNumber, number, number, 0),
+    PB_FIELD(  2, ENUM    , OPTIONAL, STATIC, Person_PhoneNumber, type, number, &Person_PhoneNumber_type_default),
      PB_LAST_FIELD
   };
  
+Oneof
+=====
+Protocol Buffers supports `oneof`_ sections. Here is an example of ``oneof`` usage::
+
+ message MsgType1 {
+     required int32 value = 1;
+ }
+
+ message MsgType2 {
+     required bool value = 1;
+ }
+ 
+ message MsgType3 {
+     required int32 value1 = 1;
+     required int32 value2 = 2;
+ } 
+ 
+ message MyMessage {
+     required uint32 uid = 1;
+     required uint32 pid = 2;
+     required uint32 utime = 3;
+ 
+     oneof payload {
+         MsgType1 msg1 = 4;
+         MsgType2 msg2 = 5;
+         MsgType3 msg3 = 6;
+     }
+ }
+
+Nanopb will generate ``payload`` as a C union and add an additional field ``which_payload``::
+
+  typedef struct _MyMessage {
+    uint32_t uid;
+    uint32_t pid;
+    uint32_t utime;
+    pb_size_t which_payload;
+    union {
+        MsgType1 msg1;
+        MsgType2 msg2;
+        MsgType3 msg3;
+    } payload;
+  /* @@protoc_insertion_point(struct:MyMessage) */
+  } MyMessage;
+
+``which_payload`` indicates which of the ``oneof`` fields is actually set. 
+The user is expected to set the field manually using the correct field tag::
+
+  MyMessage msg = MyMessage_init_zero;
+  msg.payload.msg2.value = true;
+  msg.which_payload = MyMessage_msg2_tag;
+
+Notice that neither the ``which_payload`` field nor the unused fields in
+``payload`` will consume any space in the resulting encoded message.
+
+.. _`oneof`: https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#oneof_and_oneof_field
+
+Extension fields
+================
+Protocol Buffers supports a concept of `extension fields`_, which are
+additional fields to a message, but defined outside the actual message.
+The definition can even be in a completely separate .proto file.
+
+The base message is declared as extensible by keyword *extensions* in
+the .proto file::
+
+ message MyMessage {
+     .. fields ..
+     extensions 100 to 199;
+ }
+
+For each extensible message, *nanopb_generator.py* declares an additional
+callback field called *extensions*. The field and associated datatype
+*pb_extension_t* form a linked list of handlers. When an unknown field is
+encountered, the decoder calls each handler in turn until either one of them
+handles the field, or the list is exhausted.
+
+The actual extensions are declared using the *extend* keyword in the .proto,
+and are in the global namespace::
+
+ extend MyMessage {
+     optional int32 myextension = 100;
+ }
+
+For each extension, *nanopb_generator.py* creates a constant of type
+*pb_extension_type_t*. To link together the base message and the extension,
+you have to:
+
+1. Allocate storage for your field, matching the datatype in the .proto.
+   For example, for an *int32* field, you need an *int32_t* variable to store
+   the value.
+2. Create a *pb_extension_t* constant, with pointers to your variable and
+   to the generated *pb_extension_type_t*.
+3. Set the *message.extensions* pointer to point to the *pb_extension_t*.
+
+An example of this is available in *tests/test_encode_extensions.c* and
+*tests/test_decode_extensions.c*.
+
+.. _`extension fields`: https://developers.google.com/protocol-buffers/docs/proto#extensions
+
+Message framing
+===============
+Protocol Buffers does not specify a method of framing the messages for transmission.
+This is something that must be provided by the library user, as there is no one-size-fits-all
+solution. Typical needs for a framing format are to:
+
+1. Encode the message length.
+2. Encode the message type.
+3. Perform any synchronization and error checking that may be needed depending on application.
+
+For example, UDP packets already fulfill all the requirements, and TCP streams typically only
+need a way to identify the message length and type. Lower level interfaces such as serial ports
+may need a more robust frame format, such as HDLC (high-level data link control).
+
+Nanopb provides a few helpers to facilitate implementing framing formats:
+
+1. Functions *pb_encode_delimited* and *pb_decode_delimited* prefix the message data with a varint-encoded length.
+2. Union messages and oneofs are supported in order to implement top-level container messages.
+3. Message IDs can be specified using the *(nanopb_msgopt).msgid* option and can then be accessed from the header.
  
  Return values and error handling
  ================================
@@ -262,8 +381,8 @@ Most functions in nanopb return bool: *true* means success, *false* means failur
  
  The error messages help in guessing what is the underlying cause of the error. The most common error conditions are:
  
-1) Running out of memory. Because everything is allocated from the stack, nanopb can't detect this itself. Encoding or decoding the same type of a message always takes the same amount of stack space. Therefore, if it works once, it works always.
-2) Invalid field description. These are usually stored as constants, so if it works under the debugger, it always does.
+1) Running out of memory, i.e. stack overflow.
+2) Invalid field descriptors (would usually mean a bug in the generator).
  3) IO errors in your own stream callbacks.
  4) Errors that happen in your callback functions.
  5) Exceeding the max_size or bytes_left of a stream.
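
Note on the "Message framing" section added by this diff: it says that *pb_encode_delimited* and *pb_decode_delimited* prefix the message data with a varint-encoded length. The sketch below illustrates that framing idea in plain C, without depending on nanopb. The names *varint_encode*, *varint_decode* and *frame_message* are hypothetical helpers written for this illustration only; they are not part of the nanopb API and are not how nanopb implements it internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write `value` as a base-128 varint.
 * `out` must have room for up to 10 bytes. Returns bytes written. */
static size_t varint_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;
    assert(out != NULL);
    while (value >= 0x80) {
        /* Low 7 bits go out first, with the continuation bit set. */
        out[n++] = (uint8_t)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Hypothetical helper: read a varint from `in`.
 * Returns bytes consumed, or 0 if the input is truncated or too long. */
static size_t varint_decode(const uint8_t *in, size_t in_len, uint64_t *value)
{
    uint64_t result = 0;
    size_t i;
    assert(in != NULL && value != NULL);
    for (i = 0; i < in_len && i < 10; i++) {
        result |= (uint64_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}

/* Hypothetical helper: copy `payload` into `out`, preceded by its length
 * as a varint. Returns the total frame size, or 0 if `out` is too small. */
static size_t frame_message(const uint8_t *payload, size_t payload_len,
                            uint8_t *out, size_t out_size)
{
    uint8_t prefix[10];
    size_t prefix_len;
    assert(out != NULL);
    assert(payload != NULL || payload_len == 0);
    prefix_len = varint_encode((uint64_t)payload_len, prefix);
    if (payload_len > out_size || prefix_len > out_size - payload_len)
        return 0;
    memcpy(out, prefix, prefix_len);
    if (payload_len > 0)
        memcpy(out + prefix_len, payload, payload_len);
    return prefix_len + payload_len;
}
```

On the receiving side, a reader would call *varint_decode* on the first bytes of the stream to learn the payload length, then read exactly that many bytes before handing them to the decoder. A length prefix alone covers only item 1 of the framing needs listed in the diff; message type and error checking still have to come from elsewhere.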