PE module

The PE module allows you to create more fine-grained rules for PE files by using attributes and features of the PE file format. This module exposes most of the fields present in a PE header and provides functions which can be used to write more expressive and targeted rules. Let's see some examples:

import "pe"

rule single_section
{
    condition:
        pe.number_of_sections == 1
}

rule control_panel_applet
{
    condition:
        pe.exports("CPlApplet")
}

rule is_dll
{
    condition:
        pe.characteristics & pe.DLL
}

rule is_pe
{
    condition:
        pe.is_pe
}

Reference

machine

Changed in version 3.3.0.

Integer with one of the following values:

MACHINE_UNKNOWN
MACHINE_AM33
MACHINE_AMD64
MACHINE_ARM
MACHINE_ARMNT
MACHINE_ARM64
MACHINE_EBC
MACHINE_I386
MACHINE_IA64
MACHINE_M32R
MACHINE_MIPS16
MACHINE_MIPSFPU
MACHINE_MIPSFPU16
MACHINE_POWERPC
MACHINE_POWERPCFP
MACHINE_R4000
MACHINE_SH3
MACHINE_SH3DSP
MACHINE_SH4
MACHINE_SH5
MACHINE_THUMB
MACHINE_WCEMIPSV2

Example: pe.machine == pe.MACHINE_AMD64

checksum

New in version 3.6.0.

Integer with the "PE checksum" as stored in the OptionalHeader.

calculate_checksum

New in version 3.6.0.

Function that calculates the "PE checksum".

Example: pe.checksum == pe.calculate_checksum()
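
As a sketch of how the two values can be combined, the rule below flags files whose stored checksum is non-zero yet differs from the recomputed one. The rule name is made up for illustration, and skipping zero checksums is an assumption (the field is often left empty), not something the module requires.

```yara
import "pe"

// Hypothetical rule: the header claims a checksum, but it does not match
// the value YARA recomputes from the file contents.
rule pe_checksum_mismatch
{
    condition:
        pe.is_pe and
        pe.checksum != 0 and
        pe.checksum != pe.calculate_checksum()
}
```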

subsystem

Integer with one of the following values:

SUBSYSTEM_UNKNOWN
SUBSYSTEM_NATIVE
SUBSYSTEM_WINDOWS_GUI
SUBSYSTEM_WINDOWS_CUI
SUBSYSTEM_OS2_CUI
SUBSYSTEM_POSIX_CUI
SUBSYSTEM_NATIVE_WINDOWS
SUBSYSTEM_WINDOWS_CE_GUI
SUBSYSTEM_EFI_APPLICATION
SUBSYSTEM_EFI_BOOT_SERVICE_DRIVER
SUBSYSTEM_EFI_RUNTIME_DRIVER
SUBSYSTEM_XBOX
SUBSYSTEM_WINDOWS_BOOT_APPLICATION

Example: pe.subsystem == pe.SUBSYSTEM_NATIVE
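
The machine and subsystem fields are often checked together. The following sketch, with an illustrative rule name, matches 64-bit images built for the native subsystem; treating such files as likely kernel-mode components is an assumption of the example, not a guarantee.

```yara
import "pe"

// Hypothetical rule: x64 image that targets the native subsystem.
rule native_x64_image
{
    condition:
        pe.machine == pe.MACHINE_AMD64 and
        pe.subsystem == pe.SUBSYSTEM_NATIVE
}
```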

timestamp

PE timestamp.

pointer_to_symbol_table

New in version 3.8.0.

Value of IMAGE_FILE_HEADER::PointerToSymbolTable. Used when the PE image has COFF debug info.

number_of_symbols

New in version 3.8.0.

Value of IMAGE_FILE_HEADER::NumberOfSymbols. Used when the PE image has COFF debug info.

size_of_optional_header

New in version 3.8.0.

Value of IMAGE_FILE_HEADER::SizeOfOptionalHeader. This is the real size of the optional header; it reflects the differences between the 32-bit and 64-bit optional headers and the number of data directories.

opthdr_magic

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::Magic.

size_of_code

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfCode. This is the sum of raw data sizes in code sections.

size_of_initialized_data

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfInitializedData.

size_of_uninitialized_data

Value of IMAGE_OPTIONAL_HEADER::SizeOfUninitializedData.

entry_point

Entry point raw offset or virtual address depending on whether YARA is scanning a file or process memory respectively. This is equivalent to the deprecated entrypoint keyword.

base_of_code

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::BaseOfCode.

base_of_data

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::BaseOfData. This field only exists in 32-bit PE files.

image_base

Image base relative virtual address.

section_alignment

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SectionAlignment. When Windows maps a PE image to memory, all raw sizes (including size of header) are aligned up to this value.

file_alignment

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::FileAlignment. All raw data sizes of sections in the PE image are aligned to this value.

win32_version_value

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::Win32VersionValue.

size_of_image

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfImage. This is the total virtual size of header and all sections.

size_of_headers

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfHeaders. This is the raw data size of the PE headers including DOS header, file header, optional header and all section headers. When PE is mapped to memory, this value is subject to aligning up to SectionAlignment.

characteristics

Bitmap with PE FileHeader characteristics. Individual characteristics can be inspected by performing a bitwise AND operation with the following constants:

RELOCS_STRIPPED

Relocation info stripped from file.

EXECUTABLE_IMAGE

File is executable (i.e. no unresolved external references).

LINE_NUMS_STRIPPED

Line numbers stripped from file.

LOCAL_SYMS_STRIPPED

Local symbols stripped from file.

AGGRESIVE_WS_TRIM

Aggressively trim working set.

LARGE_ADDRESS_AWARE

Application can handle addresses larger than 2 GB.

BYTES_REVERSED_LO

Bytes of machine word are reversed.

MACHINE_32BIT

32 bit word machine.

DEBUG_STRIPPED

Debugging info stripped from the file and stored in a .DBG file.

REMOVABLE_RUN_FROM_SWAP

If Image is on removable media, copy and run from the swap file.

NET_RUN_FROM_SWAP

If Image is on Net, copy and run from the swap file.

SYSTEM

System File.

DLL

File is a DLL.

UP_SYSTEM_ONLY

File should only be run on a uniprocessor (UP) machine.

BYTES_REVERSED_HI

Bytes of machine word are reversed.

Example: pe.characteristics & pe.DLL
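
Several flags can be tested in one condition by comparing each masked value explicitly. The sketch below uses a hypothetical rule name and looks for DLLs whose relocation information has been stripped.

```yara
import "pe"

// Hypothetical rule: DLL flag set and relocations stripped.
rule dll_without_relocations
{
    condition:
        (pe.characteristics & pe.DLL) != 0 and
        (pe.characteristics & pe.RELOCS_STRIPPED) != 0
}
```

Writing the comparison with != 0 (or == 0 for a cleared flag) keeps the intent visible when several masks appear in the same condition.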

linker_version

An object with two integer attributes, one for the major and one for the minor linker version.

major

Major linker version.

minor

Minor linker version.

os_version

An object with two integer attributes, one for the major and one for the minor OS version.

major

Major OS version.

minor

Minor OS version.

image_version

An object with two integer attributes, one for the major and one for the minor image version.

major

Major image version.

minor

Minor image version.

subsystem_version

An object with two integer attributes, one for the major and one for the minor subsystem version.

major

Major subsystem version.

minor

Minor subsystem version.

dll_characteristics

Bitmap with PE OptionalHeader DllCharacteristics. Do not confuse these flags with the PE FileHeader Characteristics. Individual characteristics can be inspected by performing a bitwise AND operation with the following constants:

DYNAMIC_BASE

File can be relocated. This also marks the file as ASLR compatible.

FORCE_INTEGRITY

NX_COMPAT

Marks the file as DEP compatible.

NO_ISOLATION

NO_SEH

The file does not contain structured exception handlers. This must be set to use SafeSEH.

NO_BIND

WDM_DRIVER

Marks the file as a Windows Driver Model (WDM) device driver.

TERMINAL_SERVER_AWARE

Marks the file as terminal server compatible.
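
A cleared flag is tested by comparing the masked value with zero. The following sketch, whose rule name is only illustrative, matches PE files that declare neither ASLR nor DEP compatibility.

```yara
import "pe"

// Hypothetical rule: neither DYNAMIC_BASE (ASLR) nor NX_COMPAT (DEP) is set.
rule no_aslr_no_dep
{
    condition:
        pe.is_pe and
        (pe.dll_characteristics & pe.DYNAMIC_BASE) == 0 and
        (pe.dll_characteristics & pe.NX_COMPAT) == 0
}
```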

size_of_stack_reserve

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfStackReserve. This is the default amount of virtual memory that will be reserved for stack.

size_of_stack_commit

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfStackCommit. This is the default amount of virtual memory that will be allocated for stack.

size_of_heap_reserve

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfHeapReserve. This is the default amount of virtual memory that will be reserved for main process heap.

size_of_heap_commit

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::SizeOfHeapCommit. This is the default amount of virtual memory that will be allocated for main process heap.

loader_flags

New in version 3.8.0.

Value of IMAGE_OPTIONAL_HEADER::LoaderFlags.

number_of_rva_and_sizes

Value of IMAGE_OPTIONAL_HEADER::NumberOfRvaAndSizes. This is the number of items in the IMAGE_OPTIONAL_HEADER::DataDirectory array.

data_directories

New in version 3.8.0.

A zero-based array of data directories. Each entry holds the virtual address and size of the corresponding data directory and has the following attributes:

virtual_address

Relative virtual address (RVA) of the PE data directory. If this is zero, then the data directory is missing. Note that for digital signature, this is the file offset, not RVA.

size

Size of the PE data directory, in bytes.

The index for the data directory entry can be one of the following values:

IMAGE_DIRECTORY_ENTRY_EXPORT

Data directory for exported functions.

IMAGE_DIRECTORY_ENTRY_IMPORT

Data directory for import directory.

IMAGE_DIRECTORY_ENTRY_RESOURCE

Data directory for resource section.

IMAGE_DIRECTORY_ENTRY_EXCEPTION

Data directory for exception information.

IMAGE_DIRECTORY_ENTRY_SECURITY

This is the raw file offset and length of the image digital signature. If the image has no embedded digital signature, this directory will contain zeros.

IMAGE_DIRECTORY_ENTRY_BASERELOC

Data directory for image relocation table.

IMAGE_DIRECTORY_ENTRY_DEBUG

Data directory for debug information.

IMAGE_DIRECTORY_ENTRY_TLS

Data directory for image thread local storage.

IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG

Data directory for image load configuration.

IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT

Data directory for image bound import table.

IMAGE_DIRECTORY_ENTRY_IAT

Data directory for image Import Address Table.

IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT

Data directory for Delayed Import Table. The structure of the delayed import table is linker-dependent. The Microsoft version of delayed imports is described in the sources "delayimp.h" and "delayimp.cpp", which can be found in the MS Visual Studio 2008 CRT sources.

IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR

Data directory for .NET headers.

Example: pe.data_directories[pe.IMAGE_DIRECTORY_ENTRY_EXPORT].virtual_address != 0
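
A data directory entry is usually only meaningful when its index is below number_of_rva_and_sizes and both of its fields are non-zero. The sketch below applies that check to the security directory; the rule name is hypothetical, and the rule only tests that the directory is populated, not that the signature is valid.

```yara
import "pe"

// Hypothetical rule: the security (digital signature) directory is present
// and populated. For this entry virtual_address is a file offset, not an RVA.
rule has_security_directory
{
    condition:
        pe.number_of_rva_and_sizes > pe.IMAGE_DIRECTORY_ENTRY_SECURITY and
        pe.data_directories[pe.IMAGE_DIRECTORY_ENTRY_SECURITY].virtual_address != 0 and
        pe.data_directories[pe.IMAGE_DIRECTORY_ENTRY_SECURITY].size != 0
}
```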

number_of_sections

Number of sections in the PE.

sections

New in version 3.3.0.

A zero-based array of section objects, one for each section the PE has. Individual sections can be accessed by using the [] operator. Each section object has the following attributes:

name

Section name.

characteristics

Section characteristics.

virtual_address

Section virtual address.

virtual_size

Section virtual size.

raw_data_offset

Section raw offset.

raw_data_size

Section raw size.

pointer_to_relocations

New in version 3.8.0.

Value of IMAGE_SECTION_HEADER::PointerToRelocations.

pointer_to_line_numbers

New in version 3.8.0.

Value of IMAGE_SECTION_HEADER::PointerToLinenumbers.

number_of_relocations

New in version 3.8.0.

Value of IMAGE_SECTION_HEADER::NumberOfRelocations.

number_of_line_numbers

New in version 3.8.0.

Value of IMAGE_SECTION_HEADER::NumberOfLinenumbers.

Example: pe.sections[0].name == ".text"

Individual section characteristics can be inspected using a bitwise AND operation with the following constants:

SECTION_CNT_CODE
SECTION_CNT_INITIALIZED_DATA
SECTION_CNT_UNINITIALIZED_DATA
SECTION_GPREL
SECTION_MEM_16BIT
SECTION_LNK_NRELOC_OVFL
SECTION_MEM_DISCARDABLE
SECTION_MEM_NOT_CACHED
SECTION_MEM_NOT_PAGED
SECTION_MEM_SHARED
SECTION_MEM_EXECUTE
SECTION_MEM_READ
SECTION_MEM_WRITE

Example: pe.sections[1].characteristics & pe.SECTION_CNT_CODE
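
Because sections is an array, a for loop over 0..pe.number_of_sections - 1 can apply the same test to every section. The sketch below, with an illustrative rule name, looks for any section that is both writable and executable.

```yara
import "pe"

// Hypothetical rule: at least one section is mapped writable and executable.
rule writable_executable_section
{
    condition:
        for any i in (0..pe.number_of_sections - 1) : (
            (pe.sections[i].characteristics & pe.SECTION_MEM_WRITE) != 0 and
            (pe.sections[i].characteristics & pe.SECTION_MEM_EXECUTE) != 0
        )
}
```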

overlay

New in version 3.6.0.

A structure containing the following integer members:

offset

Overlay section offset.

size

Overlay section size.

Example: uint8(pe.overlay.offset) == 0x0d and pe.overlay.size > 1024
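
The overlay members can also be compared against the built-in filesize variable. As a sketch, the rule below (hypothetical name, arbitrary threshold) matches files in which the overlay accounts for more than half of the file.

```yara
import "pe"

// Hypothetical rule: data appended after the last section makes up more
// than half of the file. The 50% threshold is an arbitrary choice.
rule large_overlay
{
    condition:
        pe.overlay.size > 0 and
        pe.overlay.size * 2 > filesize
}
```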

number_of_resources

Number of resources in the PE.

resource_timestamp

Resource timestamp. This is stored as an integer.

resource_version

An object with two integer attributes, major and minor versions.

major

Major resource version.

minor

Minor resource version.

resources

Changed in version 3.3.0.

A zero-based array of resource objects, one for each resource the PE has. Individual resources can be accessed by using the [] operator. Each resource object has the following attributes:

offset

Offset for the resource data.

length

Length of the resource data.

type

Type of the resource (integer).

id

ID of the resource (integer).

language

Language of the resource (integer).

type_string

Type of the resource as a string, if specified.

name_string

Name of the resource as a string, if specified.

language_string

Language of the resource as a string, if specified.

All resources must have a type, id (name), and language specified. They can be either an integer or string, but never both, for any given level.

Example: pe.resources[0].type == pe.RESOURCE_TYPE_RCDATA

Example: pe.resources[0].name_string == "F\x00I\x00L\x00E\x00"

Resource types can be inspected using the following constants:

RESOURCE_TYPE_CURSOR
RESOURCE_TYPE_BITMAP
RESOURCE_TYPE_ICON
RESOURCE_TYPE_MENU
RESOURCE_TYPE_DIALOG
RESOURCE_TYPE_STRING
RESOURCE_TYPE_FONTDIR
RESOURCE_TYPE_FONT
RESOURCE_TYPE_ACCELERATOR
RESOURCE_TYPE_RCDATA
RESOURCE_TYPE_MESSAGETABLE
RESOURCE_TYPE_GROUP_CURSOR
RESOURCE_TYPE_GROUP_ICON
RESOURCE_TYPE_VERSION
RESOURCE_TYPE_DLGINCLUDE
RESOURCE_TYPE_PLUGPLAY
RESOURCE_TYPE_VXD
RESOURCE_TYPE_ANICURSOR
RESOURCE_TYPE_ANIICON
RESOURCE_TYPE_HTML
RESOURCE_TYPE_MANIFEST

For more information refer to:

http://msdn.microsoft.com/en-us/library/ms648009(v=vs.85).aspx
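
Resource attributes can be combined with YARA's data-access functions inside a loop. The sketch below assumes that offset can be used as a file offset when scanning a file, and looks for an RCDATA resource that begins with the "MZ" magic; the rule name is hypothetical.

```yara
import "pe"

// Hypothetical rule: an RCDATA resource whose first two bytes are "MZ"
// (0x5A4D when read as a little-endian 16-bit integer).
rule embedded_pe_in_rcdata
{
    condition:
        for any i in (0..pe.number_of_resources - 1) : (
            pe.resources[i].type == pe.RESOURCE_TYPE_RCDATA and
            pe.resources[i].length > 2 and
            uint16(pe.resources[i].offset) == 0x5A4D
        )
}
```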

version_info

New in version 3.2.0.

Dictionary containing the PE's version information. Typical keys are:

Comments
CompanyName
FileDescription
FileVersion
InternalName
LegalCopyright
LegalTrademarks
OriginalFilename
ProductName
ProductVersion

For more information refer to:

http://msdn.microsoft.com/en-us/library/windows/desktop/ms646987(v=vs.85).aspx

Example: pe.version_info["CompanyName"] contains "Microsoft"

number_of_signatures

Number of authenticode signatures in the PE.

signatures

A zero-based array of signature objects, one for each authenticode signature in the PE file. Usually PE files have a single signature.

thumbprint

New in version 3.8.0.

A string containing the thumbprint of the signature.

issuer

A string containing information about the issuer. These are some examples:

"/C=US/ST=Washington/L=Redmond/O=Microsoft Corporation/CN=Microsoft Code Signing PCA"

"/C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=Terms of use at https://www.verisign.com/rpa (c)10/CN=VeriSign Class 3 Code Signing 2010 CA"

"/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO Code Signing CA 2"

subject

A string containing information about the subject.

version

Version number.

algorithm

Algorithm used for this signature. Usually "sha1WithRSAEncryption".

serial

A string containing the serial number. This is an example:

"52:00:e5:aa:25:56:fc:1a:86:ed:96:c9:d4:4b:33:c7"

not_before

Unix timestamp on which the validity period for this signature begins.

not_after

Unix timestamp on which the validity period for this signature ends.

valid_on(timestamp)

Function returning true if the signature was valid on the date indicated by timestamp. The following expression:

pe.signatures[n].valid_on(timestamp)

is equivalent to:

timestamp >= pe.signatures[n].not_before and timestamp <= pe.signatures[n].not_after
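
The signature attributes are typically used inside a loop over all signatures. The sketch below, with a hypothetical rule name, matches files carrying a signature whose validity period does not cover the PE timestamp; it assumes pe.timestamp holds a Unix timestamp comparable with not_before and not_after.

```yara
import "pe"

// Hypothetical rule: at least one signature was not valid at the time
// recorded in the PE header.
rule signature_not_valid_at_build_time
{
    condition:
        pe.number_of_signatures > 0 and
        for any i in (0..pe.number_of_signatures - 1) : (
            not pe.signatures[i].valid_on(pe.timestamp)
        )
}
```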

rich_signature

Structure containing information about the PE's rich signature as documented at http://www.ntcore.com/files/richsign.htm.

offset

Offset where the rich signature starts. It will be undefined if the file doesn't have a rich signature.

length

Length of the rich signature, not including the final "Rich" marker.

key

Key used to encrypt the data with XOR.

raw_data

Raw data as it appears in the file.

clear_data

Data after being decrypted by XORing it with the key.

version(version, [toolid])

New in version 3.5.0.

Function returning a sum of count values of all matching version records. Provide the optional toolid argument to only match when both match for one entry. More information can be found here:

http://www.ntcore.com/files/richsign.htm

Note: Prior to version 3.11.0, this function returned only a boolean value (0 or 1) indicating whether the given version and optional toolid were present in an entry.

Example: pe.rich_signature.version(24215, 261) == 61

toolid(toolid, [version])

New in version 3.5.0.

Function returning a sum of count values of all matching toolid records. Provide the optional version argument to only match when both match for one entry. More information can be found here:

http://www.ntcore.com/files/richsign.htm

Note: Prior to version 3.11.0, this function returned only a boolean value (0 or 1) indicating whether the given toolid and optional version were present in an entry.

Example: pe.rich_signature.toolid(170, 40219) >= 99
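
Comparing the result with zero works with both behaviours, because a boolean 1 and a positive count are both greater than zero. The sketch below reuses the identifiers from the examples above and a hypothetical rule name.

```yara
import "pe"

// Hypothetical rule: the rich signature contains an entry for tool id 170
// with version 40219. Written as "> 0" so that it evaluates the same way
// whether the function returns a boolean (pre-3.11.0) or a count.
rule rich_header_toolchain
{
    condition:
        pe.rich_signature.toolid(170, 40219) > 0
}
```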

pdb_path

New in version 4.0.0.

Path of the PDB file for this PE if present.

Example: pe.pdb_path == "D:\\workspace\\2018_R9_RelBld\\target\\checkout\\custprof\\Release\\custprof.pdb"

exports(function_name)

Function returning true if the PE exports function_name or false otherwise.

Example: pe.exports("CPlApplet")

exports(ordinal)

New in version 3.6.0.

Function returning true if the PE exports ordinal or false otherwise.

Example: pe.exports(72)

exports(/regular_expression/)

New in version 3.7.1.

Function returning true if the PE exports a function whose name matches regular_expression, or false otherwise.

Example: pe.exports(/^AXS@@/)

exports_index(function_name)

New in version 4.0.0.

Function returning the index into the export_details array where the named function is, undefined otherwise.

Example: pe.exports_index("CPlApplet")

exports_index(ordinal)

New in version 4.0.0.

Function returning the index into the export_details array where the exported ordinal is, undefined otherwise.

Example: pe.exports_index(72)

exports_index(/regular_expression/)

New in version 4.0.0.

Function returning the first index into the export_details array where the regular expression matches the exported name, undefined otherwise.

Example: pe.exports_index(/^ERS@@/)

number_of_exports

New in version 3.6.0.

Number of exports in the PE.

export_details

New in version 4.0.0.

Array of structures containing information about the PE's exports.

offset

Offset where the exported function starts.

name

Name of the exported function. It will be undefined if the function has no name.

forward_name

The name of the function where this export forwards to. It will be undefined if the export is not a forwarding export.

ordinal

The ordinal of the exported function, after the ordinal base has been applied to it.
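
The value returned by exports_index can be used directly as an index into export_details. The sketch below uses a hypothetical rule name and an arbitrary ordinal to show the lookup; when the export is missing the index is undefined and the condition is expected not to match.

```yara
import "pe"

// Hypothetical rule: "CPlApplet" is exported and its ordinal is 1.
// The ordinal value is an arbitrary choice for illustration.
rule cplapplet_first_ordinal
{
    condition:
        pe.exports("CPlApplet") and
        pe.export_details[pe.exports_index("CPlApplet")].ordinal == 1
}
```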

dll_name

New in version 4.0.0.

The name of the DLL, if it exists in the export directory.

export_timestamp

New in version 4.0.0.

The timestamp at which the export data was created.

number_of_imports

New in version 3.6.0.

Number of imports in the PE.

imports(dll_name, function_name)

Function returning true if the PE imports function_name from dll_name, or false otherwise. dll_name is case insensitive.

Example: pe.imports("kernel32.dll", "WriteProcessMemory")

imports(dll_name)

New in version 3.5.0.

Changed in version 4.0.0.

Function returning the number of functions from the dll_name, in the PE imports. dll_name is case insensitive.

Note: Prior to version 4.0.0, this function returned only a boolean value indicating if the given DLL name was found in the PE imports. This change is backward compatible, as any number larger than 0 also evaluates as true.

Examples: pe.imports("kernel32.dll"), pe.imports("kernel32.dll") == 10

imports(dll_name, ordinal)

New in version 3.5.0.

Function returning true if the PE imports ordinal from dll_name, or false otherwise. dll_name is case insensitive.

Example: pe.imports("WS2_32.DLL", 3)

imports(dll_regexp, function_regexp)

New in version 3.8.0.

Changed in version 4.0.0.

Function returning the number of functions from the PE imports where a function name matches function_regexp and a DLL name matches dll_regexp. Both dll_regexp and function_regexp are case sensitive unless you use the "/i" modifier in the regexp, as shown in the example below.

Note: Prior to version 4.0.0, this function returned only a boolean value indicating if matching import was found or not. This change is backward compatible, as any number larger than 0 also evaluates as true.

Example: pe.imports(/kernel32\.dll/i, /(Read|Write)ProcessMemory/) == 2
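
Several imports calls can be combined in one condition. The sketch below, with an illustrative rule name, matches files that import three kernel32.dll functions often associated with writing into another process; the choice of functions is an assumption of the example.

```yara
import "pe"

// Hypothetical rule: all three functions are imported from kernel32.dll.
// dll_name is matched case-insensitively in this form of imports().
rule remote_write_imports
{
    condition:
        pe.imports("kernel32.dll", "VirtualAllocEx") and
        pe.imports("kernel32.dll", "WriteProcessMemory") and
        pe.imports("kernel32.dll", "CreateRemoteThread")
}
```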

locale(locale_identifier)

New in version 3.2.0.

Function returning true if the PE has a resource with the specified locale identifier. Locale identifiers are 16-bit integers and can be found here:

http://msdn.microsoft.com/en-us/library/windows/desktop/dd318693(v=vs.85).aspx

Example: pe.locale(0x0419) // Russian (RU)

language(language_identifier)

New in version 3.2.0.

Function returning true if the PE has a resource with the specified language identifier. Language identifiers are 8-bit integers and can be found here:

http://msdn.microsoft.com/en-us/library/windows/desktop/dd318693(v=vs.85).aspx

Example: pe.language(0x0A) // Spanish

imphash()

New in version 3.2.0.

Function returning the import hash or imphash for the PE. The imphash is an MD5 hash of the PE's import table after some normalization. The imphash for a PE can also be computed with pefile, and you can find more information in Mandiant's blog.

Example: pe.imphash() == "b8bb385806b89680e13fc0cf24f4431e"

section_index(name)

Function returning the index into the sections array for the section that has name. name is case sensitive.

Example: pe.section_index(".TEXT")

section_index(addr)

New in version 3.3.0.

Function returning the index into the sections array for the section that has addr. addr can be an offset into the file or a memory address.

Example: pe.section_index(pe.entry_point)
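
The returned index can be fed back into the sections array. As a sketch, the rule below (hypothetical name) checks whether the section containing the entry point is writable.

```yara
import "pe"

// Hypothetical rule: the entry point lies in a section marked writable.
rule entry_point_in_writable_section
{
    condition:
        (pe.sections[pe.section_index(pe.entry_point)].characteristics
            & pe.SECTION_MEM_WRITE) != 0
}
```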

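is_pe

New in version 3.8.0.

Integer that is true (non-zero) if the file is a PE. Unlike is_dll() and the other checks below, it is a field rather than a function, so it takes no parentheses.

Example: pe.is_pe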

is_dll()

New in version 3.5.0.

Function returning true if the PE is a DLL.

Example: pe.is_dll()

is_32bit()

New in version 3.5.0.

Function returning true if the PE is a 32-bit image.

Example: pe.is_32bit()

is_64bit()

New in version 3.5.0.

Function returning true if the PE is a 64-bit image.

Example: pe.is_64bit()

rva_to_offset(addr)

New in version 3.6.0.

Function returning the file offset for RVA addr. Be careful to pass relative addresses here and not absolute addresses, like pe.entry_point when scanning a process.

Example: pe.rva_to_offset(pe.sections[0].virtual_address) == pe.sections[0].raw_data_offset

This example will make sure the offset for the virtual address in the first section equals the file offset for that section.
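
rva_to_offset is also useful for reading data that a data directory points to. The sketch below assumes the .NET header starts with a 32-bit size field equal to 0x48, which is an assumption about the CLI header layout rather than something this module guarantees; the rule name is hypothetical.

```yara
import "pe"

// Hypothetical rule: the COM descriptor directory is present and the
// structure it points to starts with the expected size value (0x48).
rule dotnet_cli_header
{
    condition:
        pe.data_directories[pe.IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR].virtual_address != 0 and
        uint32(pe.rva_to_offset(
            pe.data_directories[pe.IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR].virtual_address
        )) == 0x48
}
```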