Skip to content

Commit 096e41e

Browse files
authored
restores the old bitvector order, changes the Primus Value order (#1509)
Restores the old default order for bitvectors that was broken in #1502. Despite what it is said in #1502, the default order of bitvectors was always taking the size into account and ordered `0:0` before `0:1`. It is even clearly stated in the documentation. Therefore, this commit restores the default order (and makes sure that the hash functions are consitent with the order). Also adds a new order, `Signed_value_order` and renames the newly introduced orders with longer names that are more descriptive and has less chances to clash with the existing module names (as the Bitvector module is commonly locally-opened). Finally, changes the default order for `Primus.Value`, which is now `Signed_value_order`. Note that the change of `Primus.Value` order doesn't really affect any Primus or Primus Lisp ordering operations as they are always well-typed and use the the default order of bitvectors. It affects, however dictionaries (and probably other direct uses of the `Primus.Value` module). The underlying reason why this change is necessary is to fix the symbolic executor that stores symbolic channels in a dictionary indexed by file descriptors. We store, the standard 0,1,2 descriptors are prefilled in the init method, and 0,1,2 literals evaluate to the machine word-sized bitvectors, however file descriptors of the `int` type might have a different size, and the lookup for the opened file descriptors fails. An alternative fix would be casting literals to a proper type, but since the current behavior is very unintutitive, it was decided to switch the order and compare Primus Values by values.
1 parent a972f8a commit 096e41e

File tree

4 files changed

+107
-48
lines changed

4 files changed

+107
-48
lines changed

lib/bap/bap.mli

Lines changed: 79 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -1143,39 +1143,13 @@ module Std : sig
11431143
suggested to use them, if you know what kind of operands you're
11441144
expecting.
11451145
1146-
{2 Clarification on size-morphism }
1146+
{2:bv_signs Clarification on signs}
11471147
1148-
Size-monomorphic operations (as opposed to size-polymorphic
1149-
comparison) doesn't allow to compare two operands with different
1150-
sizes, and either raise an exception or return [Error]. If we would
1151-
have a type safe interface, with type [t] defined as [type 'a t],
1152-
where ['a] stands for size, then size-monomorphic operations will
1153-
have type ['a t -> 'a t -> _], and size-polymorphic ['a t -> 'b t -> _].
1154-
1155-
By default, size-polymorphic comparison is used. To understand
1156-
the ordering relation one can think that a lexical ordering is
1157-
specified on a tuple [(x,n)], where [x] is the number and [n] is
1158-
the size. For example, the following sequence is in an ascending
1159-
order:
1160-
1161-
{[ 0x0:1, 0x0:32, 0x0:64, 0x1:1, 0x1:32, 0xD:4, 0xDEADBEEF:32]}.
1162-
1163-
A size-monomorphic interfaced is exposed in a [Mono] submodule. So
1164-
if you want a monomorphic map, then just use [Mono.Map] module.
1165-
Note, [Mono] submodule doesn't provide [Table], since we cannot
1166-
guarantee that all keys in a hash-table have equal size.
1167-
1168-
{2 Clarification on signs}
1169-
1170-
By default all numbers represented by a bitvector are considered
1171-
unsigned. This includes comparisons, e.g., [of_int (-1)
1172-
~width:32] is greater than zero. If you need to perform a signed
1173-
operation, you can use the [signed] operator to temporary cast
1174-
your value to the signed kind. We use word "temporary" to
1175-
emphasize that, the signedness property won't propagate to the
1176-
result of the operation, e.g. result of the following
1177-
expression: [Int_exn.(signed x / y)] will not be signed. In other
1178-
words each new value is created unsigned.
1148+
By default, all are numbers represented with bitvectors are
1149+
considered unsigned. This includes the ordering, e.g., [of_int
1150+
(-1) ~width:32] is greater than [of_int 0 ~width:32]. If you
1151+
need to perform a signed operation, you can use the [signed]
1152+
operator create a signed word with the same value.
11791153
11801154
If any operand of a binary operation is signed, then a signed
11811155
version of an operation is used, i.e., the other operand is
@@ -1192,6 +1166,62 @@ module Std : sig
11921166
let q = signed x < zero (* p = true *)
11931167
]}
11941168
1169+
{2:bv_sizes Clarification on size-morphism }
1170+
1171+
Size-monomorphic operations (as opposed to size-polymorphic)
1172+
expect operands of the same size. When applied to operands of
1173+
different sizes they either raise exceptions or return
1174+
an [Error] variant as the result. All arithmetic operations are
1175+
size-monomorphic and we provide interface that use either
1176+
exceptions or [Result.t] to indicate the outcome.
1177+
1178+
The comparison operation is size-polymorphic by default and
1179+
takes the size of the bitvector into account. Bitvectors
1180+
with equal values but different sizes are unequal. The precise
1181+
order matches with the order of pairs, where the first
1182+
constituent is the bitvector value, and the second is its size,
1183+
for example, the following sequence is in an ascending order:
1184+
1185+
{[ 0x0:1, 0x0:32, 0x0:64, 0x1:1, 0x1:32, 0xD:4, 0xDEADBEEF:32]}.
1186+
1187+
A size-monomorphic interfaced is exposed in a [Mono] submodule. So
1188+
if you want a monomorphic map, then just use [Mono.Map] module.
1189+
Note, [Mono] submodule doesn't provide [Table], since we cannot
1190+
guarantee that all keys in a hash-table have equal size. The
1191+
order functions provided by the Mono module will raise an
1192+
exception when applied to bitvectors with different sizes.
1193+
1194+
In the default and [Mono] orders, if either of two values is
1195+
signed (see {!bv_signs}) then the values will be ordered as
1196+
2-complement signed integers.
1197+
1198+
Another alternative orders are [Signed_value_order],
1199+
[Unsigned_value_order], and [Literal_order]. They will be
1200+
briefly described below.
1201+
1202+
[Signed_value_order] is size-polymoprhic and it simply
1203+
ignores the sizes of bitvectors and orders them by values, e.g.,
1204+
the following bitvectors are ordered in the [Value.Signed]
1205+
order, [FF:8; 0:1; 0F:8; FF:32], and [0:1] is equal to
1206+
[0:32]. See {!bv_sizes} for more details on the signedness of
1207+
operations. Note, that the size of a word still affects the
1208+
order since it defines the position of the most significant bit.
1209+
1210+
[Unsigned_value_order] ignores the sign and the size of
1211+
words and compares them by the unsigned order of their values.
1212+
he following numbers are ordered with the [Unsigned_value_order]
1213+
order, [0:1, 1:32, 0F:8 FF:8], and [FF:32] is equal to [FF:8].
1214+
[Unsigned_value_order] is faster than then any previously
1215+
described order and is useful when the size of the words should
1216+
be ignored (or is known to be equal and therefore could be
1217+
ignored).
1218+
1219+
[Literal_order] is the fastest order that takes into account
1220+
all constituents of bitvectors, like if we will treat a
1221+
bitvector as triple of its value, size, and sign and order
1222+
bitvectors using the lexicographical order.
1223+
1224+
11951225
{2:bv_string Clarification on string representation }
11961226
11971227
As a part of [Identifiable] interface bitvector provides a pair of
@@ -1238,24 +1268,32 @@ module Std : sig
12381268
(** The comparable interface with size-monomorphic comparison. *)
12391269
module Mono : Comparable with type t := t
12401270

1271+
(** Compare by value, ignore size, but take into account the sign.
12411272
1242-
(** The comparable interface using the unsigned order.
1243-
1244-
@since 2.5.0 *)
1245-
module Unsigned : sig
1273+
See {!bv_sizes} for more information.
1274+
@since 2.5.0
1275+
*)
1276+
module Signed_value_order : sig
12461277
include Binable.S with type t = t
12471278
include Comparable.S_binable with type t := t
12481279
include Hashable.S_binable with type t := t
12491280
end
12501281

1251-
(** The comparable interface using the literal order.
1282+
(** Compare by value, ignore both the size and the sign.
1283+
1284+
See {!bv_sizes} for more information.
1285+
@since 2.5.0 *)
1286+
module Unsigned_value_order : sig
1287+
include Binable.S with type t = t
1288+
include Comparable.S_binable with type t := t
1289+
include Hashable.S_binable with type t := t
1290+
end
12521291

1253-
In this order the bitvectors are compared literally, so that
1254-
bitvectors of different sizes but with equal values will be
1255-
different. This is the fastest order.
1292+
(** The lexicographical order of (value,size,sign) triples.
12561293
1294+
See {!bv_sizes} for more information.
12571295
@since 2.5.0 *)
1258-
module Literal : sig
1296+
module Literal_order : sig
12591297
include Binable.S with type t = t
12601298
include Comparable.S_binable with type t := t
12611299
include Hashable.S_binable with type t := t

lib/bap_primus/bap_primus_value.ml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ end
1717
type id = Id.t [@@deriving bin_io, compare, sexp]
1818

1919

20-
let compare_value x y = Word.compare x.value y.value
20+
let compare_value x y = Word.Signed_value_order.compare x.value y.value
2121
type t = value [@@deriving bin_io, compare]
2222

2323

lib/bap_types/bap_bitvector.ml

Lines changed: 20 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -634,13 +634,13 @@ module Trie = struct
634634
end
635635
end
636636

637-
module Literal = struct
637+
module Literal_order = struct
638638
type t = packed [@@deriving bin_io, sexp]
639639
include Comparable.Make_binable(Packed)
640640
include Hashable.Make_binable_and_derive_hash_fold_t(Packed)
641641
end
642642

643-
module Unsigned = struct
643+
module Unsigned_value_order = struct
644644
module Order = struct
645645
type t = packed [@@deriving bin_io,sexp]
646646
let compare = compare_unsigned
@@ -651,14 +651,30 @@ module Unsigned = struct
651651
include Order
652652
end
653653

654+
module Signed_value_order = struct
655+
module Order = struct
656+
type t = packed [@@deriving bin_io,sexp]
657+
let compare = compare_signed
658+
let hash = hash
659+
end
660+
include Comparable.Make_binable(Order)
661+
include Hashable.Make_binable_and_derive_hash_fold_t(Order)
662+
include Order
663+
end
664+
654665
include Or_error.Monad_infix
655666
include Regular.Make(struct
656667
type t = packed [@@deriving bin_io, sexp]
657-
let compare = compare_signed
668+
let compare x y =
669+
if phys_equal x y then 0
670+
else match Int.compare (bitwidth x) (bitwidth y) with
671+
| 0 -> compare_signed x y
672+
| r -> r
673+
[@@inline]
658674
let version = "2.0.0"
659675
let module_name = Some "Bap.Std.Word"
660676
let pp ppf = pp_generic ppf
661-
let hash = Packed.hash
677+
let hash x = Int.hash (bitwidth x) lxor hash x
662678
end)
663679
module Int_err = Safe
664680
include (Unsafe : Bap_integer.S with type t := t)

lib/bap_types/bap_bitvector.mli

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,12 +13,17 @@ type endian =
1313
include Regular.S with type t := t
1414
include Bap_integer.S with type t := t
1515
module Mono : Comparable.S with type t := t
16-
module Unsigned : sig
16+
module Signed_value_order : sig
1717
include Binable.S with type t = t
1818
include Comparable.S_binable with type t := t
1919
include Hashable.S_binable with type t := t
2020
end
21-
module Literal : sig
21+
module Unsigned_value_order : sig
22+
include Binable.S with type t = t
23+
include Comparable.S_binable with type t := t
24+
include Hashable.S_binable with type t := t
25+
end
26+
module Literal_order : sig
2227
include Binable.S with type t = t
2328
include Comparable.S_binable with type t := t
2429
include Hashable.S_binable with type t := t

0 commit comments

Comments
 (0)