Exercise log format
INJECT allows exporting the logs of an exercise for analysis. The log files use the JSONL format. Below is a description of the structure and schema of the logs.
Structure
- logs/
  - team-id/
    - uploaded_files/
      - files uploaded by team-id and for team-id during the exercise
    - llm-evaluations/
      - email_suggestions.jsonl
      - email_evaluations.jsonl
      - free_form_evaluations.jsonl
    - inject_states.jsonl
    - questionnaire_states.jsonl
    - action_logs.jsonl
    - milestones.jsonl
    - emails.jsonl
  - team-id+1/
    - ...
  - definition_files/
    - files used by the definition
  - exercise.jsonl
  - teams.jsonl
  - instructors.jsonl
  - file_infos.jsonl
  - exercise_tools.jsonl
  - exercise_questionnaires.jsonl
  - exercise_milestones.jsonl
  - exercise_learning_objectives.jsonl
  - exercise_injects.jsonl
  - exercise_channels.jsonl
  - email_participants.jsonl
  - llm_assessments.jsonl
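As a starting point for analysis, the export can be loaded with nothing but the standard library. The sketch below assumes that the exercise-level files sit directly in the logs directory and that every other sub-directory except definition_files belongs to one team. The helper names `read_jsonl` and `load_export` are illustrative and not part of INJECT.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return the JSON objects stored one per line in a JSONL file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines such as a trailing newline
                records.append(json.loads(line))
    return records


def load_export(logs_dir):
    """Load exercise-level and per-team JSONL files from an unpacked export.

    Assumption: exercise-level *.jsonl files are direct children of logs_dir,
    and each remaining sub-directory (except definition_files) is one team.
    """
    logs_dir = Path(logs_dir)
    exercise_files = {
        path.stem: read_jsonl(path) for path in sorted(logs_dir.glob("*.jsonl"))
    }
    team_files = {}
    for team_dir in sorted(p for p in logs_dir.iterdir() if p.is_dir()):
        if team_dir.name == "definition_files":
            continue  # holds files used by the definition, not team logs
        # rglob also picks up files in nested directories such as llm-evaluations/
        team_files[team_dir.name] = {
            path.stem: read_jsonl(path) for path in sorted(team_dir.rglob("*.jsonl"))
        }
    return exercise_files, team_files
```

Both returned dictionaries are keyed by file name without the .jsonl extension, so the teams of an export are available as `exercise_files["teams"]`.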
Description and format of individual files
Many of the fields described here do not contain a description. That is because these fields are direct copies of the fields described in the definition format, which can be found here.
Timestamps
All fields with the timestamp type are timestamps in ISO 8601 format.
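A minimal sketch of turning these timestamp strings into Python datetime objects follows. The exact ISO 8601 variant (UTC offset, trailing "Z", fractional seconds) is not specified here, so the sample values in the usage note are hypothetical, and the function names are illustrative.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp string; None passes through for optional fields."""
    if value is None:
        return None
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def duration_seconds(start, finish):
    """Return the wall-clock seconds between two timestamps, or None if either is missing."""
    start_dt, finish_dt = parse_timestamp(start), parse_timestamp(finish)
    if start_dt is None or finish_dt is None:
        return None
    return (finish_dt - start_dt).total_seconds()
```

For example, `duration_seconds("2024-05-01T10:00:00+00:00", "2024-05-01T10:30:00+00:00")` yields 1800.0.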
Common objects
Control
- milestone_condition: string
- activate_milestone: list of strings
- deactivate_milestone: list of strings
- roles: list of strings
Content
- raw: string – raw text, possibly markdown
- rendered: string – raw text converted to html elements
- attachments: list of uuids – list of file info ids which are attached to this content
Confirmation
- text: string
- control: control
Overlay
- duration: optional int
Exercise files
exercise.jsonl
Contains information about the exercise. The object has the following format:
- exercise_id: int – id of the exercise
- definition_id: int – id of the definition this exercise was created from
- states: list of state objects – these objects represent the state of an exercise for the teams, one for normal exercises, multiple for on-demand exercises
- exercise_state_id: int – id of the exercise state
- status: (Not Started, Stopped, Running, Finished) – the status of the exercise
- start_time: optional timestamp – timestamp when this exercise state was started
- finish_time: optional timestamp – timestamp when this exercise state was finished
- elapsed_s: int – number of in-exercise seconds that elapsed in this exercise state
teams.jsonl
Contains information about all the exercise teams. Each object has the following format:
- team_id: int – id of the team
- name: string – name of the team
- role: string – role of the team, empty string if no role
- exercise_id: int – id of the exercise the team belongs to
- finish_time: timestamp – time when the team reached one of the final milestones
- users: a list of user objects – a list of trainees assigned to this team
- user_id: uuid – id of the user
- username: optional string – username of the user, not present if the logs were anonymized
- achieved_score: int – score achieved by the team during the exercise, includes score from instructor comments
- total_score: int – total achievable score during the exercise, does not include score from instructor comments
- exercise_state_id: int – id of the exercise state for this team
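As an illustration of how the two score fields relate, the sketch below builds a simple ranking from already parsed teams.jsonl objects. The function name `score_summary` is hypothetical. Because achieved_score includes instructor comment scores and total_score does not, the ratio may exceed 1 or be negative.

```python
def score_summary(teams):
    """Return (name, achieved_score, total_score, ratio) tuples, best team first."""
    rows = []
    for team in teams:
        achieved = team["achieved_score"]
        total = team["total_score"]
        # Guard against exercises without any achievable score.
        ratio = achieved / total if total else None
        rows.append((team["name"], achieved, total, ratio))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```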
instructors.jsonl
Contains all instructors assigned to the exercise. Each object has the following format:
- user_id: uuid – id of the user
- username: optional string – username of the user, not present if the logs were anonymized
exercise_injects.jsonl
Contains all injects for this exercise. Each object has the following format:
- inject_id: int – id of the inject
- name: string
- display_name: string
- time: int
- delay: int
- organization: string
- type: (Info, Email)
- target: channel object
- channel_id: int – id of the channel
- name: string
- display_name: string
- description: string
- type: (Info, Email)
- alternatives: list of alternative objects
Info inject alternative
- alternative_id: int – id of the alternative
- name: string
- content: content
- control: control
- confirmation: optional confirmation object
Email inject alternative
- alternative_id: int – id of the alternative
- name: string
- subject: string
- content: content
- control: control
- extra_copies: int
exercise_milestones.jsonl
Contains all milestones for this exercise, which are referenced by milestones in each team. Each object has the following format:
- milestone_id: int – id of the milestone
- name: string
- display_name: string
- description: string
- tags: list of strings
- file_names: list of strings – space separated list of file names
- initial_state: bool
- score: int – score added to the team when this milestone is reached
exercise_tools.jsonl
Contains all tools for this exercise, which are referenced by action logs in each team. Each object has the following format:
- tool_id: int – id of the tool
- name: string
- tooltip_description: string
- default_response: content
- roles: list of strings
- requires_input: bool
- responses: list of response objects
- response_id: int – id of the response
- param: string
- content: content
- control: control
- time: int
- regex: bool
exercise_questionnaires.jsonl
Contains all questionnaires for this exercise. Each object has the following format:
- questionnaire_id: int – id of the questionnaire
- name: string
- content: content
- time: int
- control: control
- repeatable: optional repeatable object
- max_attempts: int
- on_fail: optional control
- questions: list of question objects
- question_id: int – id of the question
- content: content
- type: (Radio, Free-form, Auto-free-form, Multiple-choice)
- details: additional details depending on the question type
Radio question details
- labels: list of strings
- correct: int
- max: int
Free-form question details
- related_milestone_ids: list of int
- assessment_id: optional int – id of the llm assessment
Auto-free-form question details
- correct_answer: string
- correct: optional control
- incorrect: optional control
Multiple-choice question details
- labels: list of strings
- correct: list of ints
- exact_match: boolean
exercise_learning_objectives.jsonl
Contains all learning objectives for the exercise. Each object has the following format:
- objective_id: int – id of the learning objective
- name: string
- description: string
- tags: list of strings
- total_score: int – total score achievable by completing all activities in this objective
- activities: list of activity objects
- activity_id: int – id of the activity
- name: string
- description: string
- tags: list of strings
- milestone_ids: list of int
- total_score: int – total score achievable by reaching all milestones linked to this activity
exercise_channels.jsonl
Contains all channels that exist for the exercise. Each object has the following format:
- channel_id: int – id of the channel
- name: string
- display_name: string
- description: string
- type: (Info, Email, Tool, Form)
email_participants.jsonl
Contains all email participants for this exercise, which are referenced by email threads and emails in each team. Each object has the following format:
- participant_id: int – id of the email participant
- address: string
- exercise_id: int – id of the exercise this participant belongs to
- team_id: optional int – team which this participant represents, null if it does not belong to any team
- definition_address: optional definition address object – email address from the definition which this participant represents, null if it belongs to a team, definition address format:
- email_address_id: int – id of the definition address
- address: string
- description: string
- team_visible: boolean
- organization: string
- control: control
- assessment_id: optional int – id of the llm assessment
file_infos.jsonl
Contains all file infos for this exercise. Each object has the following format:
- file_id: uuid – id of the file, this is also the name of the file on the file system
- file_name: string – original name of the file
- uploaded_by_id: optional uuid – uuid of the user that uploaded this file, null if the file was not uploaded
- uploaded_at: optional timestamp – timestamp when this file was uploaded, null if the file was not uploaded
llm_assessments.jsonl
Contains all LLM assessments for this exercise. Each object has the following format:
- assessment_id: int – id of the assessment
- persona: string
- assessment: string
Individual team files
inject_states.jsonl
Contains all the inject states for the team. Each object has the following format:
- team_id: int – id of the team
- inject_id: int – id of the inject
- status: (Unsent, Delayed, Sent) – status of the inject
- alternative_id: optional int – id of the sent alternative
questionnaire_states.jsonl
Contains all the questionnaire states for the team. Each object has the following format:
- questionnaire_state_id: int – id of the questionnaire state
- questionnaire_id: int – id of the questionnaire
- team_id: int – id of the team
- status: (Unsent, Sent, Answered, Reviewed) – status of the questionnaire
- timestamp_sent: optional timestamp
- timestamp_answered: optional timestamp
- timestamp_reviewed: optional timestamp
- reviewed_by_id: optional uuid – id of the instructor that reviewed the questionnaire
- submissions: list of submission objects
- answers: list of answer objects
- question_id: int – id of the question
- answer: list of strings – submitted answers for this question
- correct: (Correct, Incorrect, Partially Correct, Unknown) – correctness of the question as determined by the backend
- attempt: int – attempt number of this answer
- attempt: int – attempt number of this submission
- accepted: boolean – flag that determines whether this submission was accepted by the platform, controlled by the repeatable field on the questionnaire
- correct: (Correct, Incorrect, Partially Correct, Unknown) – correctness of the whole submission, excluding questions which cannot be automatically evaluated
- answers: list of answer objects
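The following sketch tallies the overall correctness of accepted submissions from parsed questionnaire_states.jsonl objects. It assumes that accepted and correct are fields of each submission object and that the correctness values are serialized exactly as listed above. The function name is illustrative.

```python
from collections import Counter


def accepted_submission_results(questionnaire_states):
    """Count accepted submissions per overall correctness value."""
    tally = Counter()
    for state in questionnaire_states:
        for submission in state.get("submissions", []):
            if submission["accepted"]:  # skip submissions the platform rejected
                tally[submission["correct"]] += 1
    return tally
```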
milestones.jsonl
Contains all milestone states for the team. Each object has the following format:
- milestone_state_id: int – id of the milestone state
- milestone_id: int – id of the milestone this state belongs to, the referenced milestone can be found in exercise_milestones.jsonl
- reached: bool – state of the milestone
- timestamp_reached: optional timestamp – time of the last state change for this milestone
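Because milestone states only carry the milestone_id, readable output requires a lookup in exercise_milestones.jsonl. A minimal sketch of that join, operating on already parsed objects and using the hypothetical function name `reached_milestones`:

```python
def reached_milestones(milestone_states, exercise_milestones):
    """Return (milestone name, timestamp_reached) for every reached milestone state."""
    names = {m["milestone_id"]: m["name"] for m in exercise_milestones}
    return [
        (names.get(state["milestone_id"]), state["timestamp_reached"])
        for state in milestone_states
        if state["reached"]
    ]
```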
emails.jsonl
Optional file, included only if the email feature is enabled. Contains all email threads and emails for the team. Each object has the following format:
- thread_id: int – id of the email thread
- subject: string
- timestamp: timestamp – time when this email thread was created
- participants: list of ints – list of email participant ids that belong to the thread, the referenced participants can be found in email_participants.jsonl
- emails: list of email objects – list of emails sent to this thread, email object format:
- email_id: int – id of the email
- thread_id: int – id of the thread this email was sent to
- sender_id: int – id of the email participant that sent the email
- timestamp: timestamp – time when this email was sent
- content: content
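Since emails reference their sender only by participant id, a readable transcript needs the addresses from email_participants.jsonl. The sketch below shows one way to resolve them for a single parsed thread object. The function name `thread_transcript` is illustrative, and emails are kept in the order in which they appear in the file.

```python
def thread_transcript(thread, participants):
    """Return (timestamp, sender address, raw text) for each email in a thread."""
    addresses = {p["participant_id"]: p["address"] for p in participants}
    return [
        (email["timestamp"], addresses.get(email["sender_id"]), email["content"]["raw"])
        for email in thread["emails"]
    ]
```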
action_logs.jsonl
Contains all the action logs for the team. Each object has the following format:
- action_log_id: int – id of the action log
- type: (Inject, Custom Inject, Tool, Email, Form, Form Submission, Form Review, Confirmation, File Download, Milestone Modification, Sandbox Log) – type of the action log
- timestamp: timestamp – time when this action log was created
- channel_id: int – id of the channel this action log was sent to
- instructor_comment: optional instructor comment
- comment: string – the content of the comment
- score: int – score value assigned to the comment, this value is added to the team's achieved score, can be negative
- created_by_id: optional uuid – uuid of the instructor that created this comment
- created_at: timestamp – timestamp when this comment was created
- edited_by_id: optional uuid – uuid of the instructor that edited this comment
- edited_at: optional timestamp – timestamp when this comment was edited
- previous_log_id: optional int – optional id of an action log that is connected to this action log
- user_id: optional uuid – optional uuid of the user that performed this action, null for automatic actions performed by the platform
- in_exercise_time: int – the in-exercise-time value when this action happened
- details: additional details depending on the action log type
Inject details
- inject_id: int – id of the inject
- content: content
- confirmation: optional confirmation object
- overlay: optional overlay
Custom Inject details
- content: content
- overlay: optional overlay
Tool details
- tool_id: int – id of the used tool
- argument: string – argument provided to the tool
- content: content
- selected_response_id: optional int – id of the selected tool response
Email details
Same fields as described in emails.jsonl in the emails field.
Form details
- questionnaire_state_id: int – id of the sent questionnaire state
Form Submission details
Same fields as described in questionnaire_states.jsonl in the submissions field.
Form Review details
This object currently contains no additional fields.
Confirmation details
This object currently contains no additional fields.
File Download details
- file_info_id: uuid – uuid of the file info that was downloaded
Milestone Modification details
- activated_milestone_states: list of ints – ids of the milestone states that were activated
- deactivated_milestone_states: list of ints – ids of the milestone states that were deactivated
- cause: (0, 2, 4) – the cause for this milestone modification, 0 for trainee action, 2 for instructor action, 4 for automatic action
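For reporting, the numeric cause codes can be translated with a small lookup table. This is only a sketch: the labels are taken from the description above, while the names `CAUSE_LABELS` and `describe_cause` are illustrative.

```python
# Cause codes as documented for Milestone Modification details.
CAUSE_LABELS = {
    0: "trainee action",
    2: "instructor action",
    4: "automatic action",
}


def describe_cause(details):
    """Return a readable label for the cause of a milestone modification."""
    cause = details["cause"]
    return CAUSE_LABELS.get(cause, f"unknown ({cause})")
```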
Sandbox Log details
- cmd: string – the executed command
- cmd_source: string – the source of the log within the container (e.g., Filebeat)
- working_directory: string – the directory within the container in which the command was executed
- username: string – the user by whom the command was executed within the container
- container: string – the name of the container in which the command was executed
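A team's activity can be summarized by grouping its action logs by type and then reading the type-specific details. The sketch below assumes the type values are serialized exactly as listed above (for example "Sandbox Log"). Both function names are illustrative.

```python
from collections import Counter


def count_by_type(action_logs):
    """Count the action logs of one team per action log type."""
    return Counter(log["type"] for log in action_logs)


def sandbox_commands(action_logs):
    """Return (in_exercise_time, container, cmd) for Sandbox Log entries, earliest first."""
    commands = [
        (log["in_exercise_time"], log["details"]["container"], log["details"]["cmd"])
        for log in action_logs
        if log["type"] == "Sandbox Log"
    ]
    return sorted(commands, key=lambda entry: entry[0])
```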
email_suggestions.jsonl
Contains all email suggestions generated for this team. Each object has the following format:
- suggestion_id: int – id of the suggestion
- thread_id: int – id of the thread
- trigger_email_id: int – id of the email that the suggestion is responding to
- email_participant_id: int – id of the definition participant the LLM suggests to respond as
- response: string – the suggested text
- created_at: timestamp – time when this suggestion was created
email_evaluations.jsonl
Contains all email evaluations generated for this team. Each object has the following format:
- evaluation_id: int – id of the email evaluation
- action_log_id: int – id of the email action log
- assessment_id: int – id of the llm assessment
- response: string – the text of the evaluation
- created_at: timestamp – time when this evaluation was created
free_form_evaluations.jsonl
Contains all free form evaluations generated for this team. Each object has the following format:
- evaluation_id: int – id of the free-form evaluation
- submission_id: int – id of the questionnaire submission
- question_id: int – id of the free-form question
- assessment_id: int – id of the llm assessment
- response: string – the text of the evaluation
- created_at: timestamp – time when this evaluation was created
Comparing logs from multiple exercises
The logs are constructed so that logs from multiple exercises created from the same definition can be compared easily. The exercises must use the exact same upload of the definition; otherwise the IDs of injects, tools and other definition data will not match. Between different runs of the same definition, the exercise_injects.jsonl, exercise_milestones.jsonl and exercise_tools.jsonl files will be the same. The email_participants.jsonl files will not match, because email participants need to be re-generated for every exercise. However, the definition address they are linked to will have the same ID in all runs.
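Because milestone IDs are stable across runs of the same definition upload, per-milestone results can be compared directly. A minimal sketch, assuming each run is given as a list holding the parsed milestones.jsonl contents of every team in that run. The function names are illustrative.

```python
from collections import Counter


def reach_counts(team_milestone_states):
    """Count, per milestone_id, how many teams of one run reached the milestone."""
    counts = Counter()
    for states in team_milestone_states:
        for state in states:
            if state["reached"]:
                counts[state["milestone_id"]] += 1
    return counts


def compare_runs(run_a, run_b):
    """Map each milestone_id to (teams reached in run A, teams reached in run B)."""
    a, b = reach_counts(run_a), reach_counts(run_b)
    return {mid: (a.get(mid, 0), b.get(mid, 0)) for mid in sorted(set(a) | set(b))}
```

Milestones that no team reached in either run do not appear in the result. Joining against exercise_milestones.jsonl would be needed to list them as well.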