Package org.apache.uima.cas.impl
Class CasSerializerSupport.CasDocSerializer
java.lang.Object
org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
- Enclosing class:
CasSerializerSupport
Uses an inner class to hold the data for serializing a CAS. Each call to serialize() creates its
own instance.
The class is package-private so that a test case can access it.
It is not static so that it can share the logger and the initializing values (this could be changed).
-
Field Summary
Fields (Modifier and Type, Field: Description)
final CASImpl cas
final IntVector[] indexedFSs
final boolean isDelta: Whether the serializer needs to serialize only the deltas, that is, new FSs created after mark represented by Marker object and preexisting FSs and Views that have been modified.
final boolean isDynamicMultiRef: Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists
final boolean isFiltering: Whether the serializer needs to check for filtered-out types/features.
final boolean isFormattedOutput
final ListUtils listUtils
final MarkerImpl marker: Used to tell if a FS was created before or after mark.
final PositiveIntSet multiRefFSs: Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON which is computing the multi-refs, not depending on the setting in a feature.
boolean needNameSpaces
nsPrefixesUsed: the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in
nsUriToPrefixMap: map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string
for Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization
final Comparator<Integer> sortFssByType: sort a view, by type and then by begin/end asc/des for subtypes of Annotation, then by id
final TypeSystemImpl tsi
final PositiveIntSet_impl visited_not_yet_written: set of FSs that have been visited and enqueued to be serialized - exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
-
Constructor Summary
Constructors
CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
Method Summary
Methods (Modifier and Type, Method: Description)
final int classifyType(int type): Classifies a type.
void encodeFS(int addr): Encode an individual FS.
void encodeIndexed
void encodeQueued
getNameSpacePrefix(String uimaTypeName, String nsUri, int lastDotIndex)
int getSofaAddr(int sofaNum)
TypeImpl[] getSortedUsedTypes
getXmiId(int addr): Get the XMI ID to use for an FS.
int getXmiIdAsInt(int addr)
boolean isStaticMultiRef(int featCode)
void serialize: Starts serialization
void writeViewsCommons
-
Field Details
-
cas
-
tsi
-
visited_not_yet_written
set of FSs that have been visited and enqueued to be serialized
- exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
- FSs added to this, during "enqueue" phase, prior to encoding
uses:
- for Arrays and Lists, used to detect multi-refs
- for Lists, used to detect loops
- during enqueuing phase, prevent multiple enqueuings
- during encoding phase, to prevent multiple encodings
Public for use by JsonCasSerializer
-
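The uses listed for visited_not_yet_written can be illustrated with a minimal sketch. This is not the UIMA implementation: a plain `HashSet<Integer>` stands in for `PositiveIntSet_impl`, the class and method names (`VisitedSketch`, `enqueue`) are hypothetical, and the special handling of "inline" arrays and lists is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration; not part of UIMA. */
final class VisitedSketch {
  // Stand-in for visited_not_yet_written (the real field is a PositiveIntSet_impl).
  final Set<Integer> visited = new HashSet<>();
  // Stand-in for multiRefFSs.
  final Set<Integer> multiRefs = new HashSet<>();
  // FS addresses waiting to be encoded.
  final Deque<Integer> queue = new ArrayDeque<>();

  /** Enqueues addr at most once; returns true only the first time addr is seen. */
  boolean enqueue(int addr) {
    if (!visited.add(addr)) {
      // Seen before, so at least two references point at this FS.
      multiRefs.add(addr);
      return false;
    }
    queue.add(addr);
    return true;
  }

  public static void main(String[] args) {
    VisitedSketch s = new VisitedSketch();
    s.enqueue(10);
    s.enqueue(20);
    s.enqueue(10); // second reference to the same FS
    System.out.println("queue=" + s.queue + " multiRefs=" + s.multiRefs);
  }
}
```

A single set membership test covers both duties here: it stops a second enqueuing and, as a side effect, flags the FS as multiply referenced.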
multiRefFSs
Set of FSs that have multiple references.
Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation.
This is for JSON, which is computing the multi-refs, not depending on the setting in a feature.
This is also for xmi, to enable adding to "queue" (once) for each FS of this kind.
Used:
- limit the number of times this is put onto the queue to 1.
- skip encoding of items on "queue" if not in this Set (maybe not needed? 8/2017 mis)
- serialize if not in indexed set, dynamic ref == true, and in this set (otherwise serialize only from ref)
-
isDynamicMultiRef
public final boolean isDynamicMultiRef
Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists
-
previouslySerializedFSs
-
modifiedEmbeddedValueFSs
-
indexedFSs
-
listUtils
-
typeCode2namespaceNames
-
needNameSpaces
public boolean needNameSpaces -
nsUriToPrefixMap
map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string -
nsPrefixesUsed
the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in -
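How nsUriToPrefixMap and nsPrefixesUsed can work together is sketched below. The class and method names (`NsPrefixSketch`, `prefixFor`) are hypothetical, and appending a numeric suffix to resolve a collision is an assumption made for this example, not a statement about what the serializer actually does.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration; not part of UIMA. */
final class NsPrefixSketch {
  // Stand-in for nsUriToPrefixMap: namespace URI -> prefix already assigned to it.
  final Map<String, String> nsUriToPrefix = new HashMap<>();
  // Stand-in for nsPrefixesUsed: may be pre-filled with prefixes from set-aside data.
  final Set<String> prefixesUsed = new HashSet<>();

  /** Returns the prefix for nsUri, generating a collision-free one if needed. */
  String prefixFor(String nsUri, String preferred) {
    String existing = nsUriToPrefix.get(nsUri);
    if (existing != null) {
      return existing; // same URI always maps to the same prefix
    }
    String candidate = preferred;
    int suffix = 2;
    // Assumed collision strategy: try preferred, preferred2, preferred3, ...
    while (!prefixesUsed.add(candidate)) {
      candidate = preferred + suffix++;
    }
    nsUriToPrefix.put(nsUri, candidate);
    return candidate;
  }

  public static void main(String[] args) {
    NsPrefixSketch ns = new NsPrefixSketch();
    ns.prefixesUsed.add("cas"); // pretend set-aside data already uses "cas"
    System.out.println(ns.prefixFor("http:///uima/cas.ecore", "cas"));
    System.out.println(ns.prefixFor("http:///a/tcas.ecore", "tcas"));
  }
}
```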
marker
Used to tell if a FS was created before or after mark. -
isDelta
public final boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
-
isFiltering
public final boolean isFiltering
Whether the serializer needs to check for filtered-out types/features. Set to true if the type system of the CAS does not match the type system that was passed to the constructor of the serializer.
-
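The conditions described for isDelta and isFiltering can be written out as a small sketch. Everything here is a stand-in: `Marker` is a hypothetical class, plain `Object` references replace `CASImpl` and `TypeSystemImpl`, "matches" is read as reference identity, and the null check on the filter type system is an assumption.

```java
/** Hypothetical illustration; not part of UIMA. */
final class FlagsSketch {
  /** Stand-in for a marker that remembers the CAS it was created on. */
  static final class Marker {
    final Object cas;

    Marker(Object cas) {
      this.cas = cas;
    }
  }

  /** True when a marker exists and belongs to the CAS being serialized. */
  static boolean isDelta(Marker marker, Object cas) {
    return marker != null && marker.cas == cas;
  }

  /** True when a filter type system was supplied and differs from the CAS's own. */
  static boolean isFiltering(Object filterTypeSystem, Object casTypeSystem) {
    // Assumption: a null filter type system means "no filtering requested".
    return filterTypeSystem != null && filterTypeSystem != casTypeSystem;
  }

  public static void main(String[] args) {
    Object cas = new Object();
    Object ts = new Object();
    System.out.println(isDelta(new Marker(cas), cas));
    System.out.println(isFiltering(new Object(), ts));
  }
}
```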
filterTypeSystem
-
isFormattedOutput
public final boolean isFormattedOutput -
sortFssByType
Sorts a view by type; then, for subtypes of Annotation, by begin (ascending) and end (descending); then by id.
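The ordering described for sortFssByType can be sketched as a comparator. This is an illustration only: the real field is a `Comparator<Integer>` over FS addresses, whereas this sketch compares a hypothetical `Fs` value class, orders types by a numeric type code, and reads "begin/end asc/des" as begin ascending, end descending. All three are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of the ordering; not the UIMA implementation. */
final class SortSketch {
  /** Hypothetical flattened FS holding only what the ordering needs. */
  static final class Fs {
    final int id;
    final int typeCode;
    final boolean isAnnotation;
    final int begin;
    final int end;

    Fs(int id, int typeCode, boolean isAnnotation, int begin, int end) {
      this.id = id;
      this.typeCode = typeCode;
      this.isAnnotation = isAnnotation;
      this.begin = begin;
      this.end = end;
    }
  }

  static final Comparator<Fs> BY_TYPE_THEN_POSITION = (a, b) -> {
    int c = Integer.compare(a.typeCode, b.typeCode);
    if (c != 0) {
      return c;
    }
    if (a.isAnnotation && b.isAnnotation) {
      c = Integer.compare(a.begin, b.begin); // begin ascending
      if (c != 0) {
        return c;
      }
      c = Integer.compare(b.end, a.end); // end descending
      if (c != 0) {
        return c;
      }
    }
    return Integer.compare(a.id, b.id); // final tie-break on id
  };

  public static void main(String[] args) {
    List<Fs> view = new ArrayList<>();
    view.add(new Fs(3, 1, true, 0, 5));
    view.add(new Fs(1, 1, true, 0, 10));
    view.add(new Fs(2, 0, false, 0, 0));
    view.sort(BY_TYPE_THEN_POSITION);
    for (Fs fs : view) {
      System.out.println(fs.id);
    }
  }
}
```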
-
-
Constructor Details
-
Method Details
-
serialize
Starts serialization
- Throws:
Exception
- -
-
getSofaAddr
public int getSofaAddr(int sofaNum)
- Parameters:
sofaNum - starts at 1
- Returns:
the addr of the sofa FS, or 0
-
writeViewsCommons
- Throws:
Exception
-
getSortedUsedTypes
-
encodeIndexed
- Throws:
Exception
-
encodeQueued
- Throws:
Exception
-
encodeFS
Encode an individual FS.
JSON has 2 encodings.
For type:
"typeName" : [ { "@id" : 123, feat : value .... }, { "@id" : 456, feat : value .... }, ... ], ...
For id:
"nnnn" : { "@type" : typeName, feat : value ... }
For cases where the top-level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
- Parameters:
addr - The address to be encoded.
- Throws:
SAXException - passthru
Exception
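The two JSON layouts described for encodeFS differ only in what the outer key is. The sketch below builds both from the same data to make that visible. It is a simplified illustration, not the serializer's code: `JsonLayoutSketch` and `Fs` are hypothetical, each FS carries exactly one string-valued feature, and no JSON escaping is performed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the two layouts; not part of UIMA. */
final class JsonLayoutSketch {
  /** Hypothetical FS with a single string feature. */
  static final class Fs {
    final int id;
    final String type;
    final String feat;
    final String value;

    Fs(int id, String type, String feat, String value) {
      this.id = id;
      this.type = type;
      this.feat = feat;
      this.value = value;
    }
  }

  /** Layout keyed by type name: each type maps to an array of FS objects carrying "@id". */
  static String byType(List<Fs> fss) {
    Map<String, List<String>> groups = new LinkedHashMap<>();
    for (Fs fs : fss) {
      groups.computeIfAbsent(fs.type, k -> new ArrayList<>())
          .add("{\"@id\":" + fs.id + ",\"" + fs.feat + "\":\"" + fs.value + "\"}");
    }
    List<String> members = new ArrayList<>();
    for (Map.Entry<String, List<String>> e : groups.entrySet()) {
      members.add("\"" + e.getKey() + "\":[" + String.join(",", e.getValue()) + "]");
    }
    return "{" + String.join(",", members) + "}";
  }

  /** Layout keyed by id: each id maps to one FS object carrying "@type". */
  static String byId(List<Fs> fss) {
    List<String> members = new ArrayList<>();
    for (Fs fs : fss) {
      members.add("\"" + fs.id + "\":{\"@type\":\"" + fs.type + "\",\""
          + fs.feat + "\":\"" + fs.value + "\"}");
    }
    return "{" + String.join(",", members) + "}";
  }

  public static void main(String[] args) {
    List<Fs> fss = new ArrayList<>();
    fss.add(new Fs(123, "Token", "pos", "NN"));
    fss.add(new Fs(456, "Token", "pos", "VB"));
    System.out.println(byType(fss));
    System.out.println(byId(fss));
  }
}
```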
-
classifyType
public final int classifyType(int type)
Classifies a type. This returns an integer code identifying the type as one of the primitive types, one of the array types, one of the list types, or a generic FS type (anything else).
The LowLevelCAS.ll_getTypeClass(int) method classifies primitives and array types, but does not have a special classification for list types, which we need for XMI serialization. Therefore, in addition to the type codes defined on LowLevelCAS, this method can return one of the type codes TYPE_CLASS_INTLIST, TYPE_CLASS_FLOATLIST, TYPE_CLASS_STRINGLIST, or TYPE_CLASS_FSLIST.
- Parameters:
type - the type to classify
- Returns:
one of the TYPE_CLASS codes defined on LowLevelCAS or on this interface.
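The layering described for classifyType, a base classifier that knows primitives and arrays plus an extra step that recognizes list types, might look like the sketch below. It is deliberately simplified and makes several assumptions: types are identified by name rather than by int type code, `baseTypeClass` is a hypothetical stand-in for LowLevelCAS.ll_getTypeClass(int), and the numeric code values are invented for the example.

```java
/** Hypothetical illustration; not part of UIMA. */
final class ClassifySketch {
  // Invented code values for this sketch only; the real codes are defined by UIMA.
  static final int GENERIC_FS = 0;
  static final int INT = 1;
  static final int INT_ARRAY = 2;
  static final int INT_LIST = 101;
  static final int FLOAT_LIST = 102;
  static final int STRING_LIST = 103;
  static final int FS_LIST = 104;

  /** Stand-in for the base classifier: knows primitives and arrays, not lists. */
  static int baseTypeClass(String typeName) {
    switch (typeName) {
      case "uima.cas.Integer":
        return INT;
      case "uima.cas.IntegerArray":
        return INT_ARRAY;
      default:
        return GENERIC_FS;
    }
  }

  /** Adds list classifications on top of the base classifier. */
  static int classify(String typeName) {
    switch (typeName) {
      case "uima.cas.IntegerList":
      case "uima.cas.EmptyIntegerList":
      case "uima.cas.NonEmptyIntegerList":
        return INT_LIST;
      case "uima.cas.FloatList":
      case "uima.cas.EmptyFloatList":
      case "uima.cas.NonEmptyFloatList":
        return FLOAT_LIST;
      case "uima.cas.StringList":
      case "uima.cas.EmptyStringList":
      case "uima.cas.NonEmptyStringList":
        return STRING_LIST;
      case "uima.cas.FSList":
      case "uima.cas.EmptyFSList":
      case "uima.cas.NonEmptyFSList":
        return FS_LIST;
      default:
        return baseTypeClass(typeName); // primitives, arrays, generic FS
    }
  }

  public static void main(String[] args) {
    System.out.println(classify("uima.cas.NonEmptyFSList"));
    System.out.println(classify("uima.cas.IntegerArray"));
  }
}
```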
-
getXmiId
Get the XMI ID to use for an FS.
- Parameters:
addr - address of FS
- Returns:
XMI ID. If addr == CASImpl.NULL, returns null
-
getXmiIdAsInt
public int getXmiIdAsInt(int addr) -
getNameSpacePrefix
-
getUniqueString
-
getTypeNameFromXmlElementName
-
isStaticMultiRef
public boolean isStaticMultiRef(int featCode)
-