If you’ve read my previous articles on employing AI to support various aspects of construction safety, you’ll have noticed that I primarily rely on ChatGPT as the go-to generative AI tool for text generation. I have tried other models. But until now, other models haven’t shown themselves to be as useful, creative, reliable or fast as the latest ChatGPT model. That may have changed with the release of Anthropic’s latest model, Claude 3.5 Sonnet.
According to Anthropic, Sonnet “shows marked improvement in grasping nuance, humour, and complex instructions, and is exceptional at writing high-quality content with a natural, relatable tone.” Anthropic also claims that Sonnet outperforms models like GPT-4o on benchmark tests in categories such as math, reasoning, coding, and visual reasoning. To test these claims, I put Sonnet up against GPT-4o on three construction safety tests: procedure writing, causation evaluation, and vision capability.
Test 1 – Procedure Writing
I prompted both models to “create a safe work procedure based on BC Occupational Health and Safety Regulations for using a hammer drill to drill holes for anchor bolts in concrete.”
Here is what each model returned.
3.5 Sonnet |
GPT-4o |
Safe Work Procedure: Using a Hammer drill for Concrete Drilling
Purpose
This procedure outlines the safe use of a hammer drill for drilling holes in concrete to install anchor bolts, in accordance with BC Occupational Health and Safety Regulations.
Personal Protective Equipment (PPE)
· Safety glasses or face shield (WorkSafeBC Regulation 8.14)
· Hearing protection (WorkSafeBC Regulation 7.7)
· Steel-toed safety boots (WorkSafeBC Regulation 8.22)
· Dust mask or respirator (WorkSafeBC Regulation 8.32)
· Hard hat if overhead hazards are present (WorkSafeBC Regulation 8.11)
· Work gloves
Pre-Use Inspection
1. Inspect the hammer drill for any damage or defects (WorkSafeBC Regulation 4.3)
2. Check that the drill bit is sharp and properly secured
3. Ensure all guards are in place and functioning correctly
4. Verify that the electrical cord is in good condition with no exposed wires
5. Test the ground fault circuit interrupter (GFCI) if using in damp locations (WorkSafeBC Regulation 19.15)
Setup
1. Clear the work area of trip hazards and debris
2. Ensure adequate lighting in the work area (WorkSafeBC Regulation 4.64)
3. Locate and mark any embedded electrical wires, plumbing, or other hazards before drilling (WorkSafeBC Regulation 20.79)
4. Set up any required fall protection if working at heights (WorkSafeBC Regulation 11.2)
5. Position yourself on a stable surface with good footing
Operation
1. Don all required PPE before starting work
2. Hold the hammer drill firmly with both hands
3. Apply steady pressure when drilling, but do not force the tool
4. Be prepared for the drill to bind or kick back suddenly
5. Keep the cord away from the spinning bit
6. Use dust collection systems or wet drilling techniques to minimize dust exposure (WorkSafeBC Regulation 6.75)
7. Take regular breaks to avoid fatigue and hand-arm vibration syndrome (HAVS)
8. Never disable or modify safety features on the tool
Post-Use
1. Unplug the hammer drill when not in use
2. Clean the tool and work area
3. Store the hammer drill in a secure, dry location
4. Report any equipment malfunctions or safety concerns to your supervisor
Emergency Procedures
1. In case of injury, seek first aid immediately
2. Report all incidents, no matter how minor, to your supervisor
3. Know the location of first aid kits and emergency contact numbers
Training
All workers using a hammer drill must be trained in:
· Proper use and handling of the tool
· Relevant sections of the BC Occupational Health and Safety Regulations
· This safe work procedure
· Hazard recognition and control measures
By following this procedure, workers can minimize the risks associated with using a hammer drill for concrete drilling. Always prioritize safety and consult with a supervisor if you have any concerns or questions.
|
Safe Work Procedure: Using a Hammer drill to Drill Holes for Anchor Bolts in Concrete
Purpose
To establish a safe method for using a hammer drill to drill holes in concrete for anchor bolts, ensuring compliance with British Columbia Occupational Health and Safety (OHS) Regulations.
Scope
This procedure applies to all employees who will be using a hammer drill for drilling holes in concrete on the job site.
Responsibilities
Personal Protective Equipment (PPE)
Pre-Operation Checklist
Operating Procedure
1. Preparation:
2. Starting the Drill:
3. Drilling:
4. After Drilling:
Emergency Procedures
Maintenance
Documentation
Review
By following this safe work procedure, workers can effectively minimize risks associated with using a hammer drill for drilling holes in concrete, ensuring a safe and compliant work environment.
|
We can see that the two procedures differ in important ways.
• Sonnet’s procedure includes references to the BC Occupational Health and Safety Regulation. Although the references are imperfect—they don’t always refer to the exact subsection of the Regulation that is most appropriate—they are factually correct. This is impressive. Other models have failed to reliably reference the correct sections of resource materials like codes, standards, and regulations.
• Sonnet prescribed a dust collection system. I was disappointed to see GPT-4o miss this important control measure. Silica exposure is a big deal in my jurisdiction, and any procedure for an activity that produces silica dust must address the hazard.
• GPT-4o included responsibilities. The GPT-4o procedure included supervisor and worker responsibilities. Although not critical for all procedures, I appreciate that the GPT-4o procedure included this detail at the beginning.
• Sonnet warned me about kickback. Having used hammer drills myself, I know the bit may bind and the drill body can kick back. This can cause serious injuries if not anticipated. The fact that Sonnet provided this warning demonstrates superior reasoning and research abilities.
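Because the citations are only useful if someone checks them, it helps to pull them out of the procedure into a short checklist. The Python sketch below is one way to do that. It assumes citations are written the way Sonnet wrote them here ("WorkSafeBC Regulation" followed by a section number); the function name `extract_citations` and the sample text are my own illustration, and the script only lists the sections. It does not confirm that any citation is correct.

```python
import re

# Matches citations in the format the Sonnet procedure uses,
# e.g. "(WorkSafeBC Regulation 8.14)". This pattern is an assumption
# based on that output; other models may format citations differently.
CITATION_PATTERN = re.compile(r"WorkSafeBC Regulation (\d+\.\d+)")


def extract_citations(procedure_text: str) -> list[str]:
    """Return the unique section numbers cited, in order of first appearance."""
    seen: dict[str, None] = {}
    for section in CITATION_PATTERN.findall(procedure_text):
        seen.setdefault(section, None)
    return list(seen)


# A short excerpt from the Sonnet procedure above.
sample = (
    "Safety glasses or face shield (WorkSafeBC Regulation 8.14) "
    "Hearing protection (WorkSafeBC Regulation 7.7) "
    "Use dust collection systems or wet drilling techniques "
    "(WorkSafeBC Regulation 6.75)"
)

for section in extract_citations(sample):
    print(f"Check section {section} against the Regulation")
```

A person still has to open the Regulation and confirm that each listed section says what the procedure claims it does.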
Test 2 – Evaluating Causation
I provided both models with an incident description to test their ability to analyze an incident and suggest factors that may have contributed to it. I used this simple one-shot prompt:
“Provide a thorough analysis of the incident described below. Provide a comprehensive list of potential contributing factors for this incident. The driver of a garbage truck was on a highway, apparently unaware that the garbage box had become elevated during travel. The garbage box hit the side of a pedestrian overpass. The impact forced the pedestrian overpass off one support pillar, which was designed to be a "floating end." The other end, designed as a "fixed" end, did not dislodge. The overpass then fell onto the cab of the truck. As the overpass settled, the driver was fatally crushed.”
3.5 Sonnet |
GPT-4o |
Analysis:
This incident occurred when a garbage truck with an elevated garbage box collided with a pedestrian overpass on a highway. The collision caused the overpass to partially dislodge from its support structure, resulting in it falling onto the truck's cab and fatally crushing the driver. The sequence of events can be broken down as follows:
1. The garbage truck was traveling on a highway with its garbage box in an elevated position.
2. The driver was apparently unaware of the box's elevated state.
3. The elevated garbage box collided with the side of a pedestrian overpass.
4. The impact dislodged the overpass from one of its support pillars (the "floating end").
5. The other end of the overpass (the "fixed end") remained in place.
6. The overpass fell onto the cab of the truck.
7. As the overpass settled, it crushed the cab, fatally injuring the driver.
This incident highlights several critical areas of concern, including vehicle safety systems, driver awareness, infrastructure design, and the potential consequences of collisions with elevated structures.
Potential Contributing Factors:
1. Vehicle Malfunction:
o Possible failure of the hydraulic system controlling the garbage box position
o Potential issues with warning systems meant to alert the driver of an elevated box
2. Driver Error:
o Lack of awareness of the garbage box's position
o Possible distraction or fatigue leading to reduced situational awareness
o Failure to perform proper pre-trip or enroute inspections
3. Training Deficiencies:
o Inadequate training on vehicle operation and safety checks
o Insufficient emphasis on the importance of monitoring vehicle components during travel
4. Operational Procedures:
o Lack of standard operating procedures for ensuring the garbage box is lowered before highway travel
o Insufficient checks or verifications before entering highway traffic
5. Vehicle Design:
o Absence or failure of failsafe mechanisms to prevent travel with an elevated box
o Lack of clear visual or audible indicators of box position visible to the driver
6. Infrastructure Design:
o The "floating end" design of the overpass, while possibly intended for safety or structural reasons, may have contributed to its collapse
o Insufficient clearance height for the overpass, considering the potential for over-height vehicles
7. Road Design and Signage:
o Possible lack of warning signs about low clearance ahead
o Insufficient measures to prevent over-height vehicles from accessing this route
8. Maintenance Issues:
o Potential pre-existing issues with the garbage truck's hydraulic or warning systems
o Possible deterioration or damage to the overpass structure that made it more susceptible to impact
9. Organizational Factors:
o Inadequate safety culture within the waste management company
o Insufficient emphasis on regular vehicle inspections and maintenance
o Possible time pressures or scheduling issues leading to rushed operations
10. Regulatory Oversight:
o Potential gaps in regulations regarding vehicle height monitoring systems
o Insufficient enforcement of existing safety regulations for commercial vehicles
11. Emergency Response Design:
o Lack of automatic systems to alert authorities of a bridge strike
o Absence of impact-absorbing structures to mitigate collision effects
12. Human Factors:
o Possible issues with the ergonomics of the truck cabin, making it difficult for the driver to monitor the garbage box position
o Cognitive overload due to multiple tasks required during garbage collection operations
13. Environmental Factors:
o Potential adverse weather conditions affecting visibility or driver perception
o Possible glare or lighting conditions that made it difficult to notice the elevated box
14. Communication Breakdown:
o Lack of systems for other drivers to alert the garbage truck driver of the dangerous situation
15. Previous Incident History:
o Possible prior near-misses or minor incidents that were not properly investigated or acted upon
This comprehensive list of potential contributing factors covers various aspects of the incident, from immediate causes to underlying systemic issues. A thorough investigation would be necessary to determine which of these factors played a significant role in this specific incident. Addressing these factors could help prevent similar tragedies in the future.
|
Incident Analysis Overview
The incident involves a garbage truck traveling on a highway with an elevated garbage box that collided with a pedestrian overpass. This collision led to the dislodgment of one end of the overpass, causing it to fall onto the truck's cab and resulting in the fatal injury of the driver.
Contributing Factors
1. Driver Awareness and Training:
2. Mechanical Issues:
3. Operational Protocols:
4. Design and Engineering Factors:
5. Environmental Conditions:
6. Human Factors:
7. Regulatory and Compliance Issues:
Comprehensive List of Contributing Factors
1. Driver-Related:
2. Mechanical and Equipment:
3. Operational Protocols:
4. Design and Engineering:
5. Environmental Conditions:
6. Regulatory and Compliance:
Recommendations for Prevention
1. Enhanced Training Programs:
2. Improved Mechanical Systems:
3. Robust Operational Protocols:
4. Design Improvements:
5. Enhanced Regulatory Oversight:
By addressing these factors and implementing the recommended measures, similar incidents can be prevented in the future, enhancing the safety of both drivers and infrastructure.
|
In this test, both models performed well, but I give a slight edge to Sonnet for the reasons below.
1. Structure and organization:
o Claude 3.5 Sonnet's response is more detailed and structured, clearly separating analysis from potential contributing factors.
o GPT-4o's response includes an overview, contributing factors, and recommendations for prevention. It is a bit more holistic, but less detailed than Sonnet’s.
2. Depth of analysis:
o Claude 3.5 Sonnet provides a more extensive list of 15 potential contributing factors, each with multiple sub-points.
o GPT-4o offers a more condensed list of 6 main contributing factors, with some sub-points under each.
3. Focus areas:
o Claude 3.5 Sonnet includes factors such as emergency response design, communication breakdown, and previous incident history, which are not explicitly mentioned in GPT-4o's response.
o GPT-4o emphasizes operational protocols and regulatory compliance as distinct categories.
4. Level of detail:
o Claude 3.5 Sonnet generally provides specific examples and potential scenarios under each contributing factor.
o GPT-4o's response is more concise but still covers the main areas of concern.
5. Approach to human factors:
o Claude 3.5 Sonnet includes a separate category for human factors, discussing ergonomics and cognitive overload.
o GPT-4o incorporates human factors mainly under the driver-related category.
6. Environmental considerations:
o Both models mention environmental factors, but Claude 3.5 Sonnet provides more specific examples (e.g., glare, lighting conditions).
7. Sequence of events:
o Claude 3.5 Sonnet provides a detailed, step-by-step sequence of the incident.
o GPT-4o includes a brief overview of the incident without breaking it down into specific steps.
In summary, both models provide comprehensive analyses. Claude 3.5 Sonnet's response is more detailed and covers a wider range of potential factors. GPT-4o's response is more concise and action-oriented, and it includes specific recommendations for prevention.
Test 3 – Vision
One obvious use case for AI vision in the workplace is hazard identification and analysis. Like GPT-4o, Claude 3.5 Sonnet offers vision capabilities. I provided both models with an image from my Regulator’s “What’s Wrong with this Photo?” publication to see which one proves a superior hazard identifier in the workplace.
Prompt: “Review this photo carefully and identify all the hazards you see. Where applicable, provide a reference to a specific section of an OSHA standard that would apply. Provide recommendations for corrective actions.”
3.5 Sonnet |
GPT-4o |
I'll review the image and identify potential hazards, referencing OSHA standards where applicable, and provide recommendations for corrective actions.
1. Lack of eye protection: The workers are not wearing safety glasses or goggles.
OSHA Standard: 29 CFR 1926.102(a)(1) - Eye and face protection
Recommendation: Provide and ensure workers wear appropriate eye protection.
2. Tripping hazards: There are various materials and debris scattered on the ground, including wood planks and metal pieces.
OSHA Standard: 29 CFR 1926.25(a) - Housekeeping
Recommendation: Implement regular housekeeping practices to keep the work area clear of debris.
3. Potential electrical hazards: There appears to be a power tool or electrical device on the workbench, but the condition of cords is not visible.
OSHA Standard: 29 CFR 1926.405 - Wiring methods, components, and equipment for general use
Recommendation: Ensure all electrical equipment is properly maintained and cords are in good condition.
4. Improper use of hard hats: While workers are wearing hard hats, one worker's hard hat appears to be worn backwards.
OSHA Standard: 29 CFR 1926.100(a) - Head protection
Recommendation: Train workers on proper use and wearing of hard hats.
5. Inadequate fall protection: The image shows an elevated work area without visible guardrails or fall protection systems.
OSHA Standard: 29 CFR 1926.501 - Duty to have fall protection
Recommendation: Install guardrails or implement a fall protection system for elevated work areas.
6. Potential respiratory hazards: One worker is using what appears to be a power tool that may generate dust, but no respiratory protection is visible.
OSHA Standard: 29 CFR 1926.103 - Respiratory protection
Recommendation: Assess the need for respiratory protection and provide if necessary.
7. Improper material storage: Materials like wood planks are not properly stored or stacked.
OSHA Standard: 29 CFR 1926.250 - General requirements for storage
Recommendation: Implement proper material storage practices, including stacking and securing materials.
8. Lack of proper work zone delineation: The work area doesn't appear to be clearly marked off from other areas.
OSHA Standard: 29 CFR 1926.200 - Accident prevention signs and tags
Recommendation: Use barriers, tape, or signs to clearly delineate the work zone.
These recommendations would help improve safety on this construction site by addressing the identified hazards and bringing practices in line with OSHA standards.
|
I have carefully reviewed the photo and identified several hazards along with the relevant OSHA standards and recommendations for corrective actions:
1. Power Tools Usage:
2. Trip and Fall Hazards:
3. Personal Protective Equipment (PPE):
4. Electrical Hazards:
5. Fall Protection:
6. Manual Handling:
7. Hard Hat Usage:
8. Tool Inspection:
Summary of Recommendations:
By addressing these hazards and following OSHA standards, the safety of the workers can be significantly improved.
|
Interestingly, both models missed several of the hazards in the photo, including one worker’s untied shoelaces, the use of foam earplugs around a noisy jackhammer, and a meddling worker on his cell phone. Both models also identified a fall hazard that doesn’t exist. GPT-4o wasn’t sure whether all workers were wearing hard hats, though they were, and Sonnet said that one of the workers was wearing their hard hat backwards, which they weren’t. There are likely other misses and errors in both models’ analyses, but suffice it to say that AI vision models still have a way to go before they are reliable enough to provide trustworthy analyses of our workplaces.
Conclusion:
The comparison between Claude 3.5 Sonnet and GPT-4o demonstrates the rapid advancement of AI capabilities. Both models showed impressive abilities in writing safety procedures, analyzing incident causation, and identifying hazards. Claude 3.5 Sonnet appeared to have a slight edge in the depth and specificity of its analysis.
The test also revealed limitations in both models, particularly in visual hazard identification. In terms of my own use of the models, I probably will be using Sonnet more. Besides edging out GPT-4o in these types of tests, I also like its user interface and the unique features it offers.