ASTM F3263-17 (Active)
Standard Guide for Packaging Test Method Validation
Published: 2017-12-15
1.1 This guide provides information to clarify the process of validating packaging test methods, both within the organization using them and through inter-laboratory studies (ILS), addressing consensus standards as well as organization-specific methods. 1.1.1 The ILS discussion focuses on writing and interpreting test method precision statements and on alternative approaches to analyzing and stating the results. 1.2 This document provides guidance for defining and developing validations for both variable and attribute data applications. 1.3 This guide provides limited statistical guidance; however, this document does not purport to give concrete sample sizes for all packaging types and test methods. Emphasis is on statistical techniques effectively contained in reference documents already developed by ASTM and other organizations. 1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use. 1.5 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee. ====== Significance And Use ====== 4.1 Addressing consensus standards with inter-laboratory studies (ILS) and methods specific to an organization. Test methods need to be validated in many cases in order to be able to rely on the results. This has to be done at the organization performing the tests but is also performed in the development of standards through inter-laboratory studies (ILS); an ILS is not a substitute for the validation work to be performed at the organization performing the test. 4.1.1 Validations at the Testing Organization— Validations at the test-performing organization include planning, executing, and analyzing the studies. Planning should include a description of the scope of the test method, covering the test equipment and the measurement range of samples it will be used for, rationales for the choice of samples, the number of samples, and rationales for the choice of methodology. 4.1.2 Objective of ILS Studies— ILS studies (per E691-14) are not focused on the development of test methods but rather on gathering the information needed for a test method precision statement after the development stage has been successfully completed. The data obtained in the interlaboratory study may indicate, however, that further effort is needed to improve the test method. Precision in this case is defined as the repeatability and reproducibility of a test method, commonly known as gage R&R. For interlaboratory studies, repeatability deals with the variation associated with one appraiser operating a single test system at one facility, whereas reproducibility is concerned with variation between labs, each with its own unique test system. It is important to understand that if an ILS is conducted in this manner, reproducibility between appraisers and test systems in the same lab is not assessed. 4.1.3 Overview of the ILS Process— Essentially, the ILS process consists of planning, executing, and analyzing studies that are meant to assess the precision of a test method. The steps required to do this from an ASTM perspective are: create a task group, identify an ILS coordinator, create the experimental design, execute the testing, analyze the results, and document the resulting precision statement in the test method. For more detail on how to conduct an ILS refer to E691-14. 4.1.4 Writing Precision and Bias Statements— When writing precision and bias statements for an ASTM standard, the minimum expectation is that the Standard Practice outlined in E177-14 will be followed. However, in some cases it may also be useful to present the information in a form that is more easily understood by the user of the standard. Examples can be found in 4.1.5 below. 4.1.5 Alternative Approaches to Analyzing and Stating Results—Variable Data: 4.1.5.1 Capability Study: (1) A process capability greater than 2.00 indicates that the total variability (part-to-part plus test method) of the test output should be very small relative to the tolerance. Mathematically, Cp = Tolerance / (6 σTotal) > 2.00. (2) Notice that σTotal in the above equation includes both σPart and σTM. Therefore, two conclusions can be made: (a) The test method can discriminate at least 1/12 of the tolerance, and hence the test method resolution is adequate; therefore, no additional analysis such as a gage R&R study is necessary. (b) The measurement is precise relative to the specification tolerance. (3) In addition, since the TMV capability study requires the involvement of two or more operators utilizing one or more test systems, a high capability number will demonstrate consistent test method performance across operators and test systems. 4.1.5.2 Gage R&R Study: (1) The acceptance criteria proposed below for %SV, %R&R, and %P/T come from industry-wide adopted requirements for measurement systems. According to the Automotive Industry Action Group (AIAG) Measurement System Analysis manual (4th edition, p. 78), a test method can be accepted if the test method variation (σTM) accounts for less than 30 % of the total variation of the study (σTotal). (2) This is equivalent to: %R&R = (σTM / σTotal) × 100 % ≤ 30 %. (3) When historical data are available to evaluate the variability of the process, we should also have: %SV = (σTM / σProcess) × 100 % ≤ 30 %. (4) For %P/T, another industry-wide accepted practice is to represent the population using the middle 99 % of the normal distribution.
And ideally, the tolerance range of the output should be wider than this proportion. For a normally distributed population, this indicates: Tolerance ≥ 5.15 σTotal. (5) The factor 5.15 in the above equation is the width, in standard deviations, of the two-sided 99 % interval of a normal distribution. Therefore: %P/T = (5.15 σTM / Tolerance) × 100 %. (6) In practice this means that a test method with up to 6 % P/T reproducibility would be effective at assessing the P/T for a given design. 4.1.5.3 Power and Sample Size Study: (1) When comparing the means of two or more populations using statistical tests, excessive test method variability may obscure the real difference (“signal”) and decrease the power of the statistical test. As a result, a large sample size may be needed to maintain adequate power (≥ 80 %) for the statistical test. When the sample size becomes too large to accept from a business perspective, one should improve the test method before running the comparative test. Therefore, an accept/reject decision on a comparative test method could be made based on its impact on the power and sample size of the comparative test (for example, a 2-sample t-test). 4.2 Attribute Test Method Validation: 4.2.1 Objective of Attribute Test Method Validation— Attribute test method validation (ATMV) demonstrates that the training and tools provided to inspectors enable them to distinguish between good and bad product with a high degree of success. There are two criteria used to measure whether an ATMV has met this objective. The primary criterion is to demonstrate that the maximum escape rate, β, is less than or equal to its prescribed threshold βmax. The parameter β is also known as Type II error, which is the probability of wrongly accepting a non-conforming device. The secondary criterion is to demonstrate that the maximum false alarm rate, α, is less than or equal to its prescribed threshold αmax. The parameter α is also known as Type I error, which is the probability of wrongly rejecting a conforming device. 4.2.2 Overview of the ATMV Process— This section describes how an ATMV typically works. In an attribute test method validation, a single blind study is conducted that comprises both conforming and non-conforming units. The ATMV passes when the requirements of both sampling plans are met. The first sampling plan demonstrates that the test method meets the requirements for the maximum allowable beta error (escape rate), and the second sampling plan demonstrates that the test method meets the requirements for the maximum allowable alpha error (false alarm rate). In other words, the test method is able to demonstrate that it accepts conforming units and rejects non-conforming units with high levels of effectiveness. The beta error sampling plan will consist entirely of nonconforming units. The beta trials conducted by each inspector are pooled together, and the total number of misclassifications (nonconforming units that were accepted) needs to be less than or equal to the number of failures prescribed by the beta error sampling plan. The alpha error sampling plan will consist entirely of conforming units. The alpha trials conducted by each inspector are pooled together, and the total number of misclassifications (conforming units that were rejected) needs to be less than or equal to the number of failures prescribed by the alpha error sampling plan. 4.2.3 ATMV Examples— Attribute test methods cover a broad range of testing. Examples of these test method categories are listed in Table 1. The right half of the table consists of test methods that return qualitative responses, and the left half contains test methods that provide variable measurement data. 4.2.4 ATMV for Variable Measurement Data— It is good practice to analyze variable test methods as variable measurement data whenever possible. However, there are instances where measurement data are more effectively treated as qualitative data. Example: a sterile barrier system (SBS) for medical devices with a required seal strength specification of 1.0–1.5 lb/in. is to be validated. A tensile tester is to be used to measure the seal strength, but it only has a resolution of 0.01 lb. As a result, the Ppk calculations typically fail, even though a seal that is out of specification in production is very rare. The validation team determines that the data will need to be treated as attribute data, and therefore an ATMV will be required rather than a variable test method validation. 4.2.5 Self-evident Inspections— This section illustrates the requirements of a self-evident inspection called out in the definitions above. To be considered a self-evident inspection, a defect must be both discrete in nature and require little or no training to detect. The defect cannot satisfy just one or the other requirement. 4.2.5.1 The following may be considered self-evident inspections: (1) A sensor light illuminates when the lubricity level on a wire is correct and does not light up when lubrication is insufficient – Since the test equipment is creating a binary output for the inspector and the instructions are simple, this qualifies as self-evident. However, note that the test method involving the equipment still needs to be validated. (2) A component is present in the assembly – If the presence of the component is reasonably easy to detect, this qualifies as self-evident since the outcome is binary. (3) The correct component is used in the assembly – As long as the components are distinct from one another, this qualifies as self-evident since the outcome is binary. 4.2.5.2 The following would generally not be considered self-evident inspections: (1) Burn or heat discoloration – Unless the component completely changes color when overheated, this inspection is going to require the inspector to detect traces of discoloration, which fails to satisfy the discrete-condition requirement. (2) Improper forming of an S-bend or Z-bend – The component is placed on top of a template, and the inspector verifies that the component is entirely within the boundaries of the template. The bend can vary from perfectly shaped to completely out of the boundaries in multiple locations, with every level of bend in between. Therefore, this is not a discrete outcome. (3) No nicks on the surface of the component – A nick can vary in size from “not visible under magnification” to “not visible to the unaided eye” to “plainly visible to the unaided eye”. Therefore, this is not a discrete outcome. (4) No burrs on the surface of a component – Inspectors vary in the sensitivity of their touch due to calluses on their fingers, and burrs vary in their degree of sharpness and exposure. Therefore, this is neither a discrete condition nor an easy-to-train instruction. (5) A component is cracked – Cracks vary in length and severity, and inspectors vary in their ability to see visual defects. Therefore, this is neither a discrete outcome nor an easy-to-train instruction. 4.2.6 ATMV Steps: 4.2.6.1 Step 1 – Prepare the test method documentation: (1) Make sure equipment qualifications have been completed, or are at least in the validation plan to be completed, prior to executing the ATMV.
(2) Examples of equipment settings to be captured in the test method documentation include environmental or ambient conditions, magnification level on microscopes, lighting and feed rate on automatic inspection systems, pressure on a vacuum decay test, and lighting standards in a cleanroom, which might involve taking lux readings in the room to characterize the light level. (3) Work with training personnel to create pictures of the defects. It may be beneficial to also include pictures of good product and less extreme examples of the defect, since a spectrum of examples will provide better resolution for decision making. (4) Where possible, the visual design standards should be shown at the same magnification level as will be used during inspection. (5) Make sure that the ATMV is run using the most recent visual design standards and that they are good representations of the potential defects. 4.2.6.2 Step 2 – Establish acceptance criteria: (1) Identify which defects need to be included in the test. (2) Use scrap history to identify the frequency of each defect code or type. This could also be information that is simply provided by the SME. (3) Do not try to squeeze too many defects into a single inspection step. As more defects are added to an inspection process, inspectors will eventually reach a point where they are unable to check for everything, and this threshold may also show itself in the ATMV testing. Limits will vary by the type of product and test method, but for visual inspection, 15–20 defects may be the maximum number that is attainable. 4.2.6.3 Step 3 – Determine the required performance level of each defect: (1) If the ATMV testing precedes completion of a risk analysis, the suggested approach is to use a worst-case outcome or high-risk designation. This needs to be weighed against the increase in sample size associated with the more conservative rating. (2) Failure modes that do not have an associated risk index may be tested to whatever requirements are agreed upon by the validation team. If a component or assembly can be scrapped for a particular failure mode, good business sense is to make sure that the inspection is effective by conducting an ATMV. (3) Pin gages are an example of a variable output that is sometimes treated as attribute data due to poor resolution combined with tight specification limits. In this application, inspectors are trained prior to the testing to understand the level of friction that is acceptable versus unacceptable. (4) Incoming inspection is another example where variable data are often treated as attribute data. Treating variable measurements as pass/fail outcomes can allow for less complex measurement tools, such as templates, and require less training for inspectors. However, these benefits should be weighed against the additional samples that may be required and the degree of information lost. For instance, attribute data would say that samples centered between the specification limits are no different from samples just inside the specification limits. This could result in greater downstream costs and more difficult troubleshooting for yield improvements. 4.2.6.4 Step 4 – Determine acceptance criteria: (1) Refer to your company’s predefined confidence and reliability requirements; or (2) Refer to the chart example in Appendix X1. 4.2.6.5 Step 5 – Create the validation plan: (1) Determine the proportion of each defect in the sample. (a) While some sort of rationale should be provided for how the defect proportions are distributed in the ATMV, there is some flexibility in choosing the proportions. Therefore, different strategies may be employed for different products and processes, for example 10 defective parts in 30 or 20 defects in 30. The cost of the samples, along with the risk associated with incorrect outcomes, affects decision making. (b) Scrap production data will often not be available for new products. In these instances, use historical scrap from a similar product, or estimate the expected scrap proportions based on process challenges that were observed during development. Another option is to represent all of the defects evenly. 4.2.6.6 Step 6 – Determine the number of inspectors and devices needed: (1) When the number of trials is large, consider employing more than three inspectors to reduce the number of unique parts required for the test. More inspectors can inspect the same parts without adding more parts, achieving additional trials and greater statistical power. (2) Inspectors are not required to all look at the same samples, although this is probably the simplest approach. (3) For semi-automated inspection systems that are sensitive to fixture placement or setup by the inspector, multiple inspectors should still be employed for the test. (4) For automated inspection systems that are completely inspector-independent, only one inspector is needed. However, in order to reduce the number of unique parts needed, consider controlling other sources of variation such as lighting conditions, temperature, humidity, inspection time, day/night shift, and part orientation. 4.2.6.7 Step 7 – Prepare the inspectors: (1) Train the inspectors prior to testing: (a) Explain the purpose and importance of the ATMV to the inspectors. (b) Inspector training should be a two-way process. The validation team should seek feedback from the inspectors on the quality and clarity of the visual standards, pictures, and written descriptions in the inspection documentation. (1) Are there any gray areas that need clarification? (2) Would a diagram be more effective than an actual picture of the defect?
(c) Review borderline samples. Consider adding pictures/diagrams of borderline samples to the visual standards. In some cases there may be a difference between functional and cosmetic defects. This may vary by method/package type. (d) Some validation teams have performed dry-run testing to characterize the current effectiveness of the inspection. Note that the same samples should not be used for dry-run testing and final testing if the same inspectors are involved in both tests. 4.2.6.8 Step 8 – Select a representative group of inspectors as the test group: (1) There will be situations, such as a site transfer, where all of the inspectors have about the same level of familiarity with the product. If this is the case, select the test group of inspectors based on other sources of variability within the inspectors, such as their production shift, skill level, or years of experience with similar product inspection. (2) The inspectors selected for testing should at least have familiarity with the product, or this becomes an overly conservative test. For example, a lack of experience with the product may result in an increase in false positives. (3) Document that a varied group of inspectors was selected for testing. 4.2.6.9 Step 9 – Prepare the test samples: (1) Collect representative units. (a) Be prepared for ATMV testing by collecting representative defect devices early and often in the development process. Borderline samples are particularly valuable to collect at this time. However, be aware that a sample that cannot even be agreed upon as good or bad by the subject matter experts is only going to cause problems in the testing. Instead, choose samples that represent “just passing” and “just failing” relative to the acceptance criteria. (2) Use best judgment on whether artificially created defect samples adequately represent defects that occur naturally in the sealing process, distribution simulation, or other manufacturing processes. If a defect cannot be adequately replicated and/or its occurrence rate is too low to provide samples for the test, this may be a case where the defect type can be omitted from the test with a documented rationale. (3) Estimate from the overall plan how many defects the test will require, and try to obtain 1.5 times the estimated number of samples needed. This will allow for the rejection of broken and less-than-ideal samples. (4) Traceability of the samples may not be necessary. The only requirement of the samples is that they accurately portray conformance or the intended non-conformance. However, capturing traceability information may serve study purposes if the validation method runs into difficulty, or if the output of a specific non-conformance needs to be tracked. (5) It is best to have more than one SME confirm the status of every sample in the test. Keep in mind that a trainer or production supervisor may also be an SME for the defect types of a process. (6) Choose a storage method suited to the particular samples. Potential options include tackle boxes with individually labeled compartments, resealable plastic bags, and plastic bottles. Refer to the standard test method for preconditioning requirements. (7) Writing a code on each part can successfully conceal the defect type, but it is not an effective way to conceal part identity. In other words, if an inspector can remember a sample's identification number and the defect they detected on that sample, the test is already compromised the second time that inspector is given the sample. If each inspector sees each sample only once, a code placed on the sample is not a problem. (8) Video testing is another option for some manual visual inspections, especially when there is a chance the defect will change over time, for example cracks or foreign material. (9) If the product is very long/large, for example guidewires, catheters, pouches, trays, or container closure systems (cans and lids), and the defects of interest are only on a specific portion of the product, separating the relevant portion from the rest of the sample is an option. If mitigating factors such as length or delicacy are what make the complete product difficult to inspect, the complete product should be used. Example: a leak test where liquid in the package could affect the test result. (10) Take pictures or videos of the defective samples and keep them in a key for future reference. 4.2.6.10 Step 10 – Develop the protocol: (1) Suggested protocol sections: (a) Purpose and scope. (b) Reference to the test method documentation being validated. (c) Reference list of other related documents, as applicable. (d) List of the types of equipment, instruments, fixtures, and so forth used for the TMV. (e) TMV study rationale, including: (1) the statistical methods for the TMV; (2) the characteristics measured by the test method and the measurement range covered by the TMV; (3) a description of the test samples and rationale; (4) the number of samples, number of operators, and number of trials; (5) the data analysis methods, including any historical statistics that will be used for data analysis (for example, a historical mean for calculating %P/T with a one-sided specification limit). (f) TMV acceptance criteria. (g) Validation test procedure (for example, sample preparation, test environment setup, test sequence, data collection method, and so forth). (h) Randomization method. (1) There are multiple ways to randomize the order of the samples. In all cases, store the randomized order in another column; then, for each sample that will be inspected a second time by the same inspector, repeat the randomization to create a second list and append it to the first stored list. (2) Consider using Excel, Minitab, or an online random number generator to create the run order for the test. (3) Pull numbers from a container until the container is empty, and record the order. (i) Some companies impose a time limit per sample, or a total time limit for the test, so that the test is more representative of the fast-paced demands of the production environment. If used, this should be noted in the protocol. 4.2.6.11 Step 11 – Execute the protocol: (1) Make sure preconditioning requirements are adhered to during protocol execution. (2) Avoid situations where inspectors rush to complete the testing. Estimate how long each inspector will need and plan test start times in advance, leaving enough time in the shift for inspectors to complete their portion, or communicate that inspectors may leave for lunch or breaks during the test.
(3) Explain to the inspectors which inspection steps are being tested. Clarify whether any sample can have more than one defect. Be aware, however, that multiple defects on a sample can create confusion during the test. (4) If the first person fails to correctly identify the presence or absence of a defect, it is a business/team decision whether to continue the protocol with the remaining inspectors. Completing the protocol will help determine whether the problem is widespread, which helps avoid another failure the next time. On the other hand, aborting the ATMV immediately can save everyone significant time. (5) Changing the sampling plan during the test in the event of a failure is not good practice. For example, if the original beta error sampling plan is n=45, a=0 and a failure occurs, it is incorrect to update the sampling plan to n=76, a=1 during the test, because the sampling plan actually being executed is a double sampling plan with n1=45, a1=0, r1=2, n2=31, a2=1. This results in an LTPD of 5.8 % rather than the 5.0 % of the original plan. (6) Backup samples should be prepared in case a defective sample is damaged. (7) Running the test with all inspectors at the same time is risky, because the administrator will be responsible for tracking which inspector has each unmarked sample. (8) Review the misclassified samples after each inspector to determine whether the inspector detected a defect that the preparation team missed. 4.2.6.12 Step 12 – Analyze the test results: (1) Scrap with the wrong defect code or defect type: (a) In some cases an inspector describes a defect with words that were not included in the protocol. The validation team needs to determine whether the words used are synonymous with any of the listed names for that particular defect. If not, then the trial is a failure. If the words match the defect, document the exception in the deviations section of the report. (2) Excluding data from the performance calculations: (a) If a flaw is discovered in a sample after the test is complete, there are two suggested options. First, the inspectors can be tested on a replacement part afterwards if necessary. Alternatively, the replacement trials can be bypassed if the outcome of the individual trials would not change the final result of the sampling plan. The rationale should be documented in the deviations section of the report. (1) For example, consider an alpha sampling plan of n=160, a=13 that is designed to satisfy a 12 % alpha error rate. After all inspectors have completed the test, one of the conforming samples is determined to have a defect. Five of the six trials conducted on that sample found the defect, while one called it a conforming sample. The results of those six trials need to be reclassified, but do they need to be repeated? If the remaining 154 conforming trials had few enough failures that the required 12 % alpha error rate is still satisfied, there is no need for replacement trials. The same principle applies to a defective sample in the beta error sampling plan. (2) If a vacuum decay test sample fails a leak test, the process in this case, as part of the protocol, might be to send the sample back to the company that created the defective sample to confirm that it is indeed still defective. If it is found to no longer represent the desired defect type, the sample is excluded from the calculations. 4.2.6.13 Step 13 – Complete the validation report: (1) When the validation testing passes: (a) If the ATMV was difficult to pass or required special inspector training, consider adding a qualifier proficiency test to limit who is qualified to perform the process inspection. (2) When the validation testing fails: (a) Repeating the validation. (1) There is no limit to the number of times an ATMV can fail. However, some common sense should be applied, since a large number of attempts before the method passes makes the testing look like trial and error and can become an audit flag. A good rule of thumb, therefore, is to perform a dry run or feasibility assessment beforehand to optimize appraiser training and the test method, reducing the risk of a protocol failure. If an ATMV fails, the validation team members can take the test themselves. If the validation team passes, then something was not clearly communicated to the inspectors, and additional interviews are needed to determine the confusion. If the validation team also fails the ATMV, it is a strong indication that the visual inspection or attribute test method is not ready for release. (b) User error. (1) Examples of ATMV testing errors include: (a) The wrong magnification setting on a microscope. (b) Compromised sample traceability during the ATMV due to mixed-up samples. (2) A test failure indicates that the variability between inspectors needs to be reduced. The key is to understand why the test failed, correct the issue, and document the rationale, so that subsequent testing does not appear to be simply testing until the method passes. (3) If the inspectors being tested are the same as in the previous ATMV, avoid using the same samples for the subsequent ATMV where possible. (4) Interview any inspector who made a classification error to understand whether the mistake was due to a misunderstanding of the acceptance criteria or simply a lapse. (5) To improve defect detection/test method proficiency, some suggested best practices are: (a) Define an inspection sequence for the inspectors in the work instructions, for example moving from proximal to distal, or inside first, then outside. (b) When multiple locations on a part or assembly are checked for a particular attribute, provide a visual template with sequential numbering to follow during the inspection. (c) Transfer the microscope picture to a video screen for easier viewing. (d) If there are too many defect types to look for in one inspection step, some defects may be missed. Move any inspections that are unrelated to the process upstream to the process that can create the defect. (6) When inspectors have misinterpreted the standards, better differentiation between good and non-conforming product is needed. Some ideas: (a) Review the visual standards for the defect with the inspectors and trainers. (b) Determine whether a diagram would be more informative than a photograph. (c) Change the magnification on the microscope. (d) If the ATMV failed because borderline defects were wrongly accepted, slide the manufacturing acceptance threshold to a more conservative level.
This may increase the alpha error rate, which typically has more margin for error, but the beta error rate should decrease. (7) Consider using an attribute agreement analysis to help determine the root cause of an ATMV failure, as it is a good tool for assessing the agreement of nominal or ordinal ratings given by multiple appraisers. The analysis calculates the repeatability of each appraiser and the reproducibility between appraisers, similar to a variable gage R&R. 4.2.6.14 Step 14 – Post-validation activities: (1) Test method changes. (a) If a requirement, standard, or test method changes, the impact on the other two elements needs to be assessed. (b) For example, many attribute test methods, such as visual inspections, have no impact on the form, fit, or function of the device being tested. It is therefore easy to overlook that changes to the test method criteria documented in design prints, visual design standards, and visual process standards need to be carefully assessed for the impact the change could have on device performance. (c) A good practice is to bring operations and design representatives together to review the proposed change and consider its potential consequences. (d) For example, a change to the initial visual inspection criteria used during design verification builds might fail to identify a defect before the unit goes through the distribution simulation process. A stress missed during the initial inspection could be aggravated by exposure to the shock, vibration, and thermal cycling associated with the distribution simulation process. It is therefore important to understand the effect that a change to an upstream visual standard could have on downstream inspections. (2) Enhancing a validated test method— Sometimes new defects are discovered after an ATMV has been validated. There are several ways to validate detection of the new failure mode. (a) Option #1 – Add the new criteria and repeat the entire validation. (1) Pros: The end result is a complete, stand-alone validation that fully represents the final inspection configuration. (2) Cons: This is an excessive level of effort, equivalent to re-validating the entire ATMV, all to add a single inspection criterion. In addition, if the ATMV fails because of one of the pre-existing failure modes, it calls into question the validated ATMV as well as the historical production devices approved through that test method. (b) Option #2 – Run a full-power ATMV with only the defects associated with the new criteria. (1) Pros: Since the ability to detect the other defects has already been validated, this approach focuses on the newly introduced criteria. (2) Cons: The distribution of defect types in a test sample should be based on proportional historical scrap representation; an ATMV containing only one defect would therefore completely overwhelm the sample sizes of the other defect codes in the original ATMV. Second, if the number of available inspectors is limited, finding enough non-conforming samples quickly becomes a burden. (3) Note: Consider a risk-based analysis of the defect type to determine an appropriate sample size. (c) Option #3 – Run an enhancement ATMV with only the new defect at a smaller sample size. (1) Pros: This approach combines the advantages of the previous two options without the disadvantages. The focus is on the defects for the new inspection criteria, but with a sample size that adequately exercises the inspectors' knowledge. The enhancement report can point to the original ATMV and the updated MVR so that future ATMVs are aware of the new defect to include in future reports. (2) Cons: One could argue that the inspectors were not tested on the new and old defects at the same time. However, in order to assess the alpha and beta error rates, the inspectors are still presented with a mix of conforming and non-conforming parts, so the enhancement approach still challenges the inspectors' understanding of the visual requirements. (d) Option #4 – Run an enhancement ATMV based on additional process knowledge gained over time. (1) Pros: The sample size may be able to be reduced, because the additional data collected over time give a better understanding of the true occurrence rate of the defect. (2) Cons: The sample size may increase once more data on the occurrence rate of the defect are available. (3) Leveraging an attribute test method, where appropriate, is a smart and efficient approach. In particular, leveraging should be considered when the component being inspected is identical or sufficiently similar to a previously validated component. This can occur when a part is used in a next-generation device or when a product is transferred to another site. (a) Suggested requirements for leveraging an ATMV: (1) The same tools, inspection criteria, and training methods are used. For example, microscopes should be set at the same magnification. If the inspection is being loosened or tightened for the new device, the ATMV should be repeated. (2) The new product still needs to meet the minimum performance requirements. Thus, if a defect was rated with a low risk index on the old product but the same defect on the new product is considered medium or high risk, the existing ATMV cannot be leveraged unless the original ATMV was tested to the high-risk requirement level. (3) There are no changes or “improvements” to the inspection process or acceptance criteria. One person's clarification is another person's confusion. (4) A protocol is not necessarily required when leveraging; only a report is needed. The message here is that there should be a consistent way to document that a test method has been leveraged, but the activity does not need to touch the protocol. Instead, the original test method report can simply be updated to state that the additional product, line, or site is also using this validated test method. 4.3 Variable Test Method Validation: 4.3.1 Objective of Variable Test Method Validation: 4.3.1.1 A variable test method should be validated before its results are assumed to be accurate and precise. Test method variation is present in all testing, and its impact on test results and/or the tolerance of the product specification should be assessed. The purposes of running a variable test method validation are to: (1) Demonstrate that the precision of the test method is adequate for the test characteristic being measured. (2) Provide objective evidence of the consistent operation of the test equipment and consistent performance of the test procedure.
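The split between part-to-part and test method variation that this assessment rests on (described in 4.3.2.1, and underlying the 30 % criterion of 4.1.5.2) can be sketched numerically. The following is a minimal Python illustration; the variance component values are hypothetical, not taken from the standard:

```python
import math

# Hypothetical variance components, in units of the measured output.
sigma_part = 0.20   # true part-to-part variation
sigma_tm = 0.05     # test method (gage R&R) variation

# Variances add, so the observed spread is the root sum of squares
# of the two components: sigma_observed^2 = sigma_part^2 + sigma_tm^2.
sigma_observed = math.sqrt(sigma_part**2 + sigma_tm**2)

# Share of the total observed variation contributed by the test method,
# to be compared against the 30 % acceptance threshold.
pct_rr = 100 * sigma_tm / sigma_observed

print(round(sigma_observed, 4))  # 0.2062 -> only slightly above sigma_part
print(round(pct_rr, 1))          # 24.3   -> under 30 %, acceptable
```

Note how a test method contributing 25 % of the standard deviation inflates the observed spread by only about 3 %, because the components combine in quadrature; this is why the acceptance criteria are stated on standard deviations rather than on the observed spread directly.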
4.3.1.2 Variable TMV covers the precision requirements for the use of a gage or test equipment. Although the accuracy of gages or test equipment is not explicitly defined in this document, it should be evaluated per your company's calibration procedures and equipment qualification processes, or the equipment manufacturer's recommendations. A successful outcome of a variable TMV, combined with completed calibration and equipment qualification, provides assurance that the test method, including the test equipment, takes accurate and precise measurements. 4.3.2 Test Method Variation: 4.3.2.1 In general, an observed test result contains two sources of variation: the part-to-part variation and the test method variation. The relationship between the observed, part-to-part, and test method variation can be described mathematically as σObserved² = σPart² + σTM², or graphically as shown in Fig. 1. (1) Capability study. (a) In the absence of a statistically driven sample size, collect at least n=30 measurement data points from two or more appraisers. (b) Existing data can be used if they contain at least 30 samples; a TMV protocol is not required in that case. However, a reference to the original data source should be included in the TMV report. (c) If the test method is executed by one designated appraiser, it is permissible to use data from only that appraiser to satisfy the n=30 sample size requirement. However, the test method documentation and the test method validation report should clearly state this restriction. To add additional trained appraisers to the test method afterwards, the original appraiser's data need to be re-evaluated, and the test method documentation and test method validation report then revised. (2) Gage repeatability and reproducibility study. (a) A gage repeatability and reproducibility (gage R&R) study is a statistical method used to: (1) estimate the repeatability, reproducibility, and total variability of the test method; (2) assess whether the precision of the test method is adequate relative to the tolerance range of the part or product being measured, or relative to the overall process variation or total study variation. (b) Sample sizes for a gage R&R study. (1) The goal of a gage R&R study is to estimate σRepeatability and σReproducibility. The sample sizes affect the standard deviation (or variance) estimates as follows: the number of parts affects σRepeatability and σReproducibility; the number of operators affects σReproducibility; the number of trials affects σRepeatability. (2) In general, the larger the sample sizes, the more accurate the standard deviation estimates. This is because the sample standard deviation statistic follows a chi-square distribution whose degrees of freedom are related to the sample size. A larger sample size (higher degrees of freedom) results in a chi-square distribution with less spread, and therefore a narrower confidence interval for the standard deviation. (3) A detailed mathematical evaluation of gage R&R sample sizes, described in the reference book Design and Analysis of Gauge R&R Studies, leads to the following rule of thumb: (# of parts) × (# of operators) × (# of trials − 1) ≥ 15. It is recommended that 3 or more operators participate in the gage R&R study to reduce the risk of obtaining an inflated estimate of reproducibility, which has a tighter threshold. (c) Non-destructive test methods. (1) Use at least 3 test samples for the gage R&R study. If the scope of the TMV includes multiple products/models/labels, select samples for testing using one of the following three approaches: (a) Use samples of the product/model/label that most challenges the TMV acceptance criteria (for example, the product with high test method variability but the narrowest tolerance range). (b) Use a bracketing technique. Run two or more gage R&R studies on the products/models/labels representing the likely extreme conditions. Justification is necessary that this strategy adequately assures test method performance across the whole spectrum bracketed by those extreme conditions. (c) Pool samples of different products/models/labels into one gage R&R study. This approach is applicable when the pooled products/models/labels share the same tolerance, or when the test method variation can be assumed to be comparable across the different products/models/labels. (2) Randomly select at least 2 appraisers from all qualified test method appraisers, or cover a wide range of experience levels. (3) Follow Table 3 to determine the minimum number of trials (repeated readings) per appraiser per test sample. (4) Randomize the test sequence, and prepare data sheets as necessary. (d) Destructive test methods. (1) For destructive test methods, it is difficult to separate the repeatability of the test method from the part-to-part variation. However, the repeatability and reproducibility of the test method can still be assessed in a destructive gage R&R study. Four approaches can be used: (a) Use surrogates. A surrogate is a substitute for the actual product/part that is representative of the intended use of the test method. A surrogate may allow repeated measurements like a non-destructive test, or the part-to-part variability of the surrogates may be small enough, compared to the repeatability component of the measurement error, to have minimal impact on the total measurement error. To use surrogates, a description of the surrogate and justification of its representativeness for the intended use of the test method should be included in the TMV protocol. The number of surrogate samples, the number of appraisers, and the number of trials should follow Table 3. For example, using magnets of varying strength to represent the range of seal strengths to be studied is one possible idea. (b) Use master groups. A master group (also known as a master batch or master lot) is a collection of homogeneous units with small part-to-part variation (for example, units drawn from a single manufacturing lot). In the gage R&R study, each master group is treated as a hypothetical “test part”, and each unit within a master group is treated as a trial. Note that under this approach the estimated repeatability includes not only the true test method repeatability but also the part-to-part variation within the master groups. To use the master group approach for a gage R&R study, the master groups should be clearly defined and documented in the TMV protocol. See Table 3 for the recommended number of master groups, number of appraisers, and number of trials.
(c) If homogeneity of the master groups cannot be presumed, a gage R&R analysis can still be performed to estimate the reproducibility of the test method using the master group approach. However, the repeatability of the test method will be overestimated by the inclusion of the larger part-to-part variation within the master groups. In this situation, a separate study using surrogates, standards, or other techniques can be used to estimate the repeatability of the test method. The total standard deviation of the test method is then calculated using the root sum of squares (RSS) method: σTM = sqrt(σRepeatability² + σReproducibility²). (d) In some cases where the measurement of a test sample is not exactly repeatable, the variation of subsequent measurements can be characterized using engineering or statistical models. Such models may help estimate the true test method repeatability. Use of this approach usually involves advanced scientific or statistical methods, and users should consult an SME for assistance and approval. Note 1: For commonly used destructive packaging test methods, such as the ASTM F88 heat seal peel test, it is important to understand that a test method validation performed on a tensile tester (the instrument for this method) should focus on the variation of the method, not of the material. Therefore, if possible, try to use materials with as little variation as possible that still cover the range of forces that will be tested. (3) Power/sample size study— A power/sample size study applies only to the validation of a comparative test method, which is used to compare the means of populations using statistical tests (for example, a 1-sample t-test, 2-sample t-test, or ANOVA). The purpose of conducting a power/sample size study is to ensure that the test method variability does not obscure the mean difference (“signal”) that should be detected by the statistical test with adequate power (≥ 80 %). (a) To conduct a power/sample size study: (1) Establish the practical significance Δ for the comparative method, and document Δ with its rationale in the TMV protocol and/or report. (2) Estimate the total standard deviation from at least n=15 measurements. The measurements should be taken from 5 or more unique samples. (3) Calculate the sample size n required for the statistical test to detect Δ with ≥ 80 % power. (4) Accept the test method if the sample size n is acceptable. (4) Use of existing data— Existing data can be used for a variable TMV provided they meet the following requirements: (a) The existing data should be stored in a controlled manner and be fully traceable to their original source (for example, data stored in the DHF, in an approved electronic laboratory notebook (ELN), or as an attachment to an approved change control document). (b) The existing data should have been collected from measurements of products or parts within the scope of the test method validation, or representative of the intended use of the test method. (c) The TMV report may include a rationale for how the existing data used meet these requirements. 4.3.3.4 Step 4 – Complete the TMV protocol: (1) Except when the “report only” approach is used, a TMV protocol needs to be released prior to execution of the validation testing, per ISO 11607. The TMV protocol may include, at a minimum, the following: (a) Purpose and scope of the TMV protocol. (b) Reference to the test method documentation being validated. (c) Reference list of other related documents, as applicable. (d) List of the types of equipment, instruments, fixtures, and so forth used for the TMV. (e) TMV study rationale, including: (1) the statistical methods for the TMV; (2) the characteristics measured by the test method and the measurement range covered by the TMV; (3) a description of the test samples and rationale; (4) the number of samples, number of appraisers, and number of trials; (5) the data analysis methods, including any historical statistics that will be used for data analysis (for example, a historical mean for calculating %P/T with a one-sided specification limit). (f) TMV acceptance criteria. (g) Validation test procedure (for example, sample preparation, test environment setup, test sequence, data collection method, and so forth). 4.3.3.5 Step 5 – Prepare the appraisers: (1) Select the required number of appraisers from all qualified test method appraisers, covering a wide range of skill and experience levels. (2) Train the appraisers on the test method prior to executing the TMV protocol. The training should include any special setup of the test environment and operation of equipment unique to the TMV that the appraisers will be responsible for before running the test method. The training should be documented per company training requirements. 4.3.3.6 Step 6 – Prepare the test samples: (1) Collect the required number of test samples and prepare them for the validation testing: (a) The test samples should be within the scope of the TMV, except where representative surrogates or standards are used. (b) For a TMV using a capability study or a power/sample size study, the test samples should come from nominal builds representative of the actual design and/or production process. (c) For a TMV using a gage R&R study, the test samples may include nominal, borderline, and out-of-specification units. This allows assessment of test method performance across the full tolerance range. (d) Prepare the test samples in a consistent manner. (e) Carefully label and/or control each test sample to prevent mix-ups and bias. 4.3.3.7 Step 7 – Execute the validation protocol: (1) To execute the TMV protocol: (a) Complete the equipment and test environment setup. (b) Check gage calibration and/or verification, as applicable. (c) Follow the test sequence specified in the TMV protocol. (d) Record the test results concurrently. (e) Maintain traceability of each measurement result to the appraiser and the test sample. (f) Document any deviations from the protocol for review. (g) Where possible, retain the test samples and other related information until the validation report is complete and approved. (h) If any adjustment of the test environment and equipment/fixture setup is deemed necessary during the validation testing, the impact of the adjustment on subsequent measurements should be assessed. 4.3.3.8 Step 8 – Analyze the TMV results: (1) Analysis of the TMV results should follow industry-accepted statistical methods. The impact of any deviations from the TMV protocol on the TMV results should be assessed. (2) Capability study.
(a) Assess the distribution fit and calculate the capability index (Ppk or Cpk for a one-sided specification, Pp or Cp for a two-sided specification) using an appropriate distribution and/or transformation. (3) Gage R&R study. (a) Run a gage R&R study (crossed) analysis for a non-destructive gage R&R study. (b) Run a gage R&R study (nested) analysis for a destructive gage R&R study. (c) If the gage R&R study covers different products/models/labels, use the worst-case tolerance range or the worst-case process standard deviation to calculate %P/T or %R&R. (d) If the test output has only a one-sided specification limit, the %P/T calculation needs to use either a historical mean of the test output or the gage R&R study mean (see the %P/T definition above). If a historical mean is used, it should be calculated from valid data sources (for example, data in released technical reports, design verification reports, or process validation reports). The gage R&R study mean can be used only if it represents the nominal mean of the test output. (4) Power/sample size study. (a) Calculate the power and sample size using a 95 % confidence level. A different confidence level may be used with appropriate justification in the TMV protocol and/or report. It is recommended that the confidence level used be tied to the risk associated with a failure; for example, use a higher confidence level, such as 95 %, if the characteristic being evaluated is critical to product function. (b) Use a one-sided alternative hypothesis for the power and sample size calculation, if applicable. (5) If the minimum performance requirements defined by the TMV acceptance criteria have been met, accept the results and complete a TMV report that includes: (a) a scope section that clearly identifies the products/parts and/or measurement range covered by the TMV; (b) references to the test method documentation and other related documents; (c) a list of the equipment, instruments, fixtures, and so forth used for the TMV; (d) any deviations from the TMV protocol, with appropriate justification and an assessment of the impact on the validation results; (e) statistical analysis (graphical and/or numerical outputs) of the test results relative to the TMV acceptance criteria; (f) the TMV conclusion; (g) the raw data, attached to the TMV report as attribute pages, or a reference to the raw data location in an approved electronic laboratory notebook (ELN) or data storage system per site practice; (h) if some raw data were excluded from the statistical analysis, appropriate justification explaining why the data were not included and why a repeat of the validation is not warranted (for example, what happened and why); (i) if a TMV failure occurred, the investigation of the failure and the corrective actions taken; (j) a description of any changes made to the test method after the validation testing and prior to the release of the TMV report. For changes that do not require revalidation testing, a justification for the absence of revalidation testing should be provided.
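As an illustration of the capability calculation referenced in 4.3.3.8(2), the following is a minimal Ppk sketch in Python. The data and the 1.0–1.5 lb/in. limits echo the seal strength example in 4.2.4, but the measurement values themselves are hypothetical:

```python
import statistics

def ppk(data, lsl=None, usl=None):
    """Ppk for one- or two-sided specs: the smaller distance from the
    mean to a spec limit, in units of three overall standard deviations."""
    mu = statistics.mean(data)
    s = statistics.stdev(data)  # overall (sample) standard deviation
    sides = []
    if usl is not None:
        sides.append((usl - mu) / (3 * s))
    if lsl is not None:
        sides.append((mu - lsl) / (3 * s))
    return min(sides)

# Hypothetical seal strength readings in lb/in., spec 1.0-1.5 lb/in.
data = [1.22, 1.25, 1.24, 1.23, 1.26, 1.25, 1.24, 1.23]
print(round(ppk(data, lsl=1.0, usl=1.5), 2))  # ~6.11, limited by the LSL side
print(round(ppk(data, usl=1.5), 2))           # one-sided (USL only) variant
```

For a normal-distribution fit this mirrors the usual Ppk definition; as 4.3.3.8(2)(a) notes, a non-normal output would first need an appropriate distribution or transformation, which this sketch does not attempt.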
Instead, choose samples that are representative of “just passing” and “just failing” relative to the acceptance criteria. (2) Use the best judgment as to whether the man-made defect samples adequately represent defects that naturally occur during the sealing process, distribution simulation, or other manufacturing processes, for example. If a defect cannot be adequately replicated and/or the occurrence rate is too low to provide a sample for the testing, this may be a situation where the defect type can be omitted with rationale from the testing. (3) Estimate from a master plan how many defects will be necessary for testing, and try to obtain 1.5 times the estimated number of samples required for testing. This will allow for weeding out broken samples and less desirable samples. (4) Traceability of samples may not be necessary. The only requirement on samples is that they accurately depict conformance or the intended nonconformance. However, capturing traceability information may be helpful for investigational purposes if there is difficulty validating the method or if it is desirable to track outputs to specific non-conformities. (5) There should preferably be more than one SME to confirm the status of each sample in the test. Keep in mind that a trainer or production supervisor might also be SMEs on the process defect types. (6) Select a storage method appropriate for the particular sample. Potential options include tackle boxes with separate labeled compartments, plastic resealable bags and plastic vials. Refer to your standardized test method for pre-conditioning requirements. (7) Writing a secret code number on each part successfully conceals the type of defect, but it is NOT an effective means of concealing the identity of the part. In other words, if an inspector is able to remember the identification number of a sample and the defect they detected on that sample, then the test has been compromised the second time the inspector is given that sample. 
If each sample is viewed only once by each inspector, then placing the code number on the sample is not an issue. (8) Video testing is another option for some manual visual inspections, especially if the defect has the potential to change over time, such as a crack or foreign material. (9) If the product is extremely long/large, such as a guidewire, guide catheter, pouch, tray, or container closure system (jar & lid), and the defects of interest are only in a particular segment of the product, one can choose to detach the pertinent segment from the rest of the sample. If extenuating factors such as length or delicacy are an element in making the full product challenging to inspect, then the full product should be used. Example: a leak test where liquid in the package could impact the test result. (10) Take pictures or videos of samples with defects and store them in a key for future reference. 4.2.6.10 Step 10 – Develop the protocol: (1) Suggested protocol sections (a) Purpose and scope. (b) Reference to the test method document being validated. (c) A list of references to other related documents, if applicable. (d) A list of the types of equipment, instruments, fixtures, etc. used for the TMV. (e) TMV study rationale, including: (1) Statistical method used for TMV; (2) Characteristics measured by the test method and the measurement range covered by the TMV; (3) Description of the test samples and the rationale; (4) Number of samples, number of operators, and number of trials; (5) Data analysis method, including any historical statistics that will be used for the data analysis (for example, the historical average for calculating %P/T with a one-sided specification limit). (f) TMV acceptance criteria. (g) Validation test procedures (for example, sample preparation, test environment setup, test order, data collection method, etc.). (h) Methods of randomization (1) There are multiple ways to randomize the order of the samples.
In all cases, store the randomized order in another column, then repeat and append the second randomized list to the first stored list for each sample that is being inspected a second time by the same inspector. (2) Consider using Excel, Minitab, or an online random number generator to create the run order for the test. (3) Draw numbers from a container until the container is empty and record the order. (i) Some companies apply time limits to each sample or a total time limit for the test so that the testing is more representative of the fast-paced requirements of the production environment. If used, this should be noted in the protocol. 4.2.6.11 Step 11 – Execute the protocol: (1) Be sure to comply with the pre-conditioning requirements during protocol execution. (2) Avoid situations where the inspector is hurrying to complete the testing. Estimate how long each inspector will take and plan ahead to start each test with enough time in the shift for the inspector to complete their section, or communicate that the inspector will be allowed to go for lunch or break during the test. (3) Explain to the inspector which inspection steps are being tested. Clarify whether there may be more than one defect per sample. However, note that more than one defect on a sample can create confusion during the testing. (4) If the first person fails to correctly identify the presence or absence of a defect, it is a business/team decision on whether to continue the protocol with the remaining inspectors. Completing the protocol will help characterize whether the issues are widespread, which could help avoid failing again the next time. On the other hand, aborting the ATMV right away could save considerable time for everyone. (5) It is not good practice to change the sampling plan during the test if a failure occurs. 
For instance, if the original beta error sampling plan was n=45, a=0, and a failure occurs, updating the sampling plan to n=76, a=1 during the test is incorrect, since the sampling plan being performed is actually a double sampling plan with n1=45, a1=0, r1=2, n2=31, a2=1. This results in an LTPD of 5.8%, rather than the 5.0% LTPD in the original plan. (6) Be prepared with replacement samples in reserve if a defect sample becomes damaged. (7) Running the test concurrently with all of the test inspectors is risky, since the administrator will be responsible for keeping track of which inspector has each unlabeled sample. (8) Review misclassified samples after each inspector to determine whether the inspector might have detected a defect that the prep team missed. 4.2.6.12 Step 12 – Analyze the test results: (1) Scrapping for the wrong defect code or defect type: (a) There will be instances where an inspector describes a defect with a word that wasn’t included in the protocol. The validation team needs to determine whether the word used is synonymous with any of the listed names for this particular defect. If not, then the trial fails. If the word matches the defect, then note the exception in the deviations section of the report. (2) Excluding data from calculations of performance: (a) If a defect is discovered after the test is complete, there are two suggested options. First, the inspector may be tested on a replacement part later if necessary. Alternatively, if the results of the individual trial will not alter the final result of the sampling plan, then the replacement trials can be bypassed. This rationale should be documented in the deviations section of the report. (1) As an example, consider an alpha sampling plan of n = 160, a = 13 that is designed to meet a 12% alpha error rate.
After all inspectors had completed the test, it was determined that one of the conforming samples had a defect; five of the six trials on this sample identified the defect, while one of the six called it a conforming sample. The results of the six trials need to be discarded, but do they need to be repeated? If the remaining 154 conforming trials have few enough failures to still meet the required alpha error rate of 12%, then no replacement trials are necessary. The same rationale would also apply to a defective sample in a beta error sampling plan. (2) If a vacuum decay test sample should have failed the leak test, the protocol may call for sending the sample back to the company that created the defective sample for confirmation that it is indeed still defective. If it is found to no longer be representative of the desired defect type, then the sample would be excluded from the calculations. 4.2.6.13 Step 13 – Complete the validation report: (1) When the validation test passes: (a) If the ATMV was difficult to pass or it requires special inspector training, consider adding an appraiser proficiency test to limit those who are eligible for the process inspection. (2) When the validation test fails: (a) Repeating the validation (1) There is no restriction on how many times an ATMV can fail. However, some common sense should be applied, as a high number of attempts appears to be a test-until-you-pass approach and could become an audit flag. Therefore, a good rule of thumb is to perform a dry run or feasibility assessment prior to execution to optimize appraiser training and test methodology in order to reduce the risk of failing the protocol. If an ATMV fails, members of the validation team could take the test themselves. If the validation team passes, then something isn’t being communicated clearly to the inspectors, and additional interviews are needed to identify the confusion.
If the validation team also fails the ATMV, this is a strong indication that the visual inspection or attribute test method is not ready for release. (b) User Error (1) Examples of ATMV test error include: (a) Microscope set at the wrong magnification. (b) Sample traceability compromised during the ATMV due to a sample mix-up. (2) A test failure demonstrates that the variability among inspectors needs to be reduced. The key is to understand why the test failed, correct the issue, and document the rationale, so that subsequent tests do not appear to be a test-until-you-pass approach. (3) As much as possible, the same samples should not be used for the subsequent ATMV if the same inspectors that were in the previous ATMV are being tested. (4) Interview any inspectors who committed classification errors to understand whether their errors were due to a misunderstanding of the acceptance criteria or simply a miss. (5) To improve the proficiency of defect detection/test methodology, the following are some suggested best practices: (a) Define an order of inspection in the work instruction for the inspectors, such as moving from proximal end to distal end or doing inside then outside. (b) When inspecting multiple locations on a component or assembly for specific attributes, provide a visual template with ordered numbers to follow during the inspection. (c) Transfer the microscope picture to a video screen for easier viewing. (d) If there are too many defect types to look for at one inspection step, some may get missed. Move any inspections not associated with the process back upstream to the process that would have created the defect. (6) When an inspector has misunderstood the criteria, the goal is to better differentiate good and nonconforming product. Here are some ideas: (a) Review the visual standard of the defect with the inspectors and trainers. (b) Determine whether a diagram might be more informative than a photo. (c) Change the magnification level on the microscope.
(d) If an ATMV is failing because borderline defects are being wrongly accepted, slide the manufacturing acceptance criteria threshold to a more conservative level. This will potentially increase the alpha error rate, which typically has a higher error rate allowance anyway, but the beta error rate should decrease. (7) Consider using an attribute agreement analysis to help identify the root cause of the ATMV failure, as it is a good tool to assess the agreement of nominal or ordinal ratings given by multiple appraisers. The analysis will calculate both the repeatability of each individual appraiser and the reproducibility between appraisers, similar to a variable gage R&R. 4.2.6.14 Step 14 – Post-Validation Activities: (1) Test Method Changes (a) If requirements, standards, or test methods change, the impact on the other two factors needs to be assessed. (b) As an example, many attribute test methods such as visual inspection have no impact on the form, fit or function of the device being tested. Therefore, it is easy to overlook that changes to the test method criteria documented in design prints, visual design standards, and visual process standards need to be closely evaluated for what impact the change might have on the performance of the device. (c) A good practice is to bring together representatives from operations and design to review the proposed change and consider potential outcomes of the change. (d) For example, changes to the initial visual inspection standards that were used during design verification builds may not identify defects prior to going through the process of distribution simulation. Stresses that were missed during this initial inspection may be exacerbated by exposure to shock, vibration, and thermal cycling associated with the distribution simulation process. Thus, it’s important to understand the impact that changes to the visual standards used upstream may have on downstream inspections.
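As a numerical check of the sampling-plan caution in Step 11(5): the LTPD of a plan is the defect rate at which the probability of acceptance falls to 10%. A minimal sketch using only the binomial acceptance probability (the plan parameters come from that example; the bisection helper is my own addition, not part of the guide):

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def p_accept_single(p: float, n: int, a: int) -> float:
    """Probability of acceptance for a single sampling plan (n, a)."""
    return sum(binom_pmf(k, n, p) for k in range(a + 1))

def p_accept_double(p: float) -> float:
    """Double plan n1=45, a1=0, r1=2, n2=31, a2=1:
    accept if x1 == 0, or if x1 == 1 and x2 == 0 (so x1 + x2 <= 1)."""
    return (1.0 - p) ** 45 + 45 * p * (1.0 - p) ** 44 * (1.0 - p) ** 31

def ltpd(p_accept, target: float = 0.10) -> float:
    """Defect rate where acceptance probability falls to `target` (bisection)."""
    lo, hi = 1e-6, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if p_accept(mid) > target:
            lo = mid
        else:
            hi = mid
    return mid

print(round(ltpd(lambda p: p_accept_single(p, 45, 0)), 3))  # ~0.050
print(round(ltpd(p_accept_double), 3))                      # ~0.058
```

This reproduces the 5.0% LTPD of the original n=45, a=0 plan and the 5.8% LTPD of the unintended double plan.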
(2) Augmented Test Method Validation—Sometimes a new defect is identified after the ATMV has already been validated. There are a variety of ways to validate detection of the new failure mode. (a) Option #1 – Repeat the entire validation with the addition of the new criterion. (1) Advantages: The end result is a complete, stand-alone validation that completely represents the final inspection configuration. (2) Disadvantages: This is an excessive level of work that amounts to revalidation of the entire ATMV, all for the addition of a single inspection criterion. Furthermore, if the ATMV fails for one of the pre-existing failure modes, this brings the validated ATMV into question, as well as historical production devices that were approved by this test method. (b) Option #2 – Run a fully powered ATMV with only the defect associated with the new criterion. (1) Advantages: Since the ability to detect the other defects has already been validated, this approach keeps the focus on the newly introduced criterion. (2) Disadvantages: The distribution of defect types in the test samples should be based on a proportional historical scrap representation; therefore, an ATMV comprising only the one defect would completely overwhelm the sample sizes of other defect codes in the original ATMV. Secondly, if there are a limited number of inspectors available, this quickly becomes a burdensome effort to find enough nonconforming samples. (3) Note—Consider doing a risk-based analysis on the defect type in order to determine the appropriate number of samples. (c) Option #3 – Run an augmented ATMV with only the new defect at a smaller sample size. (1) Advantages: This approach combines the advantages of the first two options without the drawbacks. The focus is on the defect with the new inspection criteria but at sample sizes that sufficiently exercise the knowledge of the inspectors.
The augmented report can point back at the original ATMV and the updated MVR, so that future ATMVs are aware of the new defect to include in future reports. (2) Disadvantages: One could argue that the inspectors are not tested on a mix of new and old defects at the same time. However, the inspectors are still presented a mix of conforming and nonconforming parts in order to evaluate both alpha and beta error rates, so an augmented approach still challenges the inspectors’ understanding of the visual requirements. (d) Option #4 – Run an augmented ATMV based on additional process knowledge gained over time. (1) Advantages: May reduce sample size, because the additional data collected over time allow a better understanding of the true occurrence rate of defects. (2) Disadvantages: May instead increase sample size if the additional data show a higher occurrence rate for defects. (3) Leveraging of Attribute Test Methods—Leveraging is a smart, efficient approach to use when it’s appropriate to do so. In particular, leveraging should be considered when the assembly being inspected is identical or sufficiently similar to a previously validated assembly. This might occur when an assembly is being used in a next-generation device or when a product is being transferred to another site. (a) Suggested ATMV requirements for leveraging: (1) The same tools, inspection criteria, and training methods are used. For instance, microscopes should be set at the same magnification level. If the inspection is being relaxed or tightened for the new device, then the ATMV should be repeated. (2) Minimum performance requirements still need to be met for the new product. So if the defect on the old product was rated as Low Risk Index but the same defect on the new product is considered to be a Medium or High Risk, then the existing ATMV cannot be leveraged unless the original ATMV was tested to High Risk requirement levels.
(3) There cannot be any changes or “improvements” to the inspection process or acceptance criteria. One man’s clarification is another’s confusion. (4) A protocol is not necessarily required when leveraging; only a report is needed. The message here is that there should be a consistent way of documenting that the test method has been leveraged, but that this activity doesn’t require the protocol to be touched. Instead, just the original test method report can be updated to say that an additional product, production line or site is also using this validated test method. 4.3 Variable Test Method Validation: 4.3.1 Objective of variable test method validation: 4.3.1.1 A variable test method should be validated before assuming the test results are accurate and precise. Test method variation exists in all testing and should be assessed for its impact on the test results and/or product specification tolerance. The purpose of running variable test method validation is to: (1) Demonstrate that the precision of the test method is adequate for the test characteristic being measured. (2) Provide objective evidence of consistent operation of test equipment and consistent performance of test procedures. 4.3.1.2 The variable TMV covers the precision requirement for using a gage or test equipment. Although accuracy of the gage or test equipment is not explicitly defined in this document, it should be evaluated per the calibration procedure and equipment qualification process of your company or the equipment manufacturer’s recommendations. The successful outcome of a variable TMV, combined with the completion of calibration and equipment qualification, will ensure that the test method (including test equipment) makes accurate and precise measurements. 4.3.2 Test Method Variation: 4.3.2.1 In general, observed test results include two sources of variation: part-to-part variation and test method variation. 
The relationship between the observed, part-to-part, and test method variation can be described mathematically: σ²_observed = σ²_part-to-part + σ²_TM. Or, graphically as shown in Fig. 1. (1) Capability Study (a) In the absence of a statistically driven sample size, defer to collecting a minimum of n=30 measurement data points from two or more appraisers. (b) Existing data can be used if it contains a minimum of 30 samples, and the TMV protocol is not required in this case. However, reference to the original data sources should be included in the TMV report. (c) If a test method is to be performed by a single designated appraiser, it is permissible to use data solely from this appraiser to meet the n=30 sample size requirement. However, the test method document and test method validation report should clearly state this limitation. In order to add additional trained appraisers subsequently to the test method, reassessment of the data from the original appraisers is required, followed by revision of the test method document and test method validation report. (2) Gage Repeatability and Reproducibility Study (a) A gage repeatability and reproducibility (Gage R&R) study is a statistical method to: (1) Estimate the repeatability, reproducibility and total variability of a test method; (2) Assess whether the precision of a test method is adequate relative to the tolerance range of the parts or products being measured, or to the overall process variation or the total study variation. (b) Sample Size of a Gage R&R Study (1) The goal of a Gage R&R study is to estimate σ_repeatability and σ_reproducibility. Sample size does impact the estimation of these standard deviations (or variances):

Sample size factor    Influences
# of parts            σ_repeatability and σ_reproducibility
# of operators        σ_reproducibility
# of trials           σ_repeatability

(2) Generally speaking, the higher the sample size, the more accurate the standard deviation estimation.
This is because the scaled sample variance follows a chi-square distribution with degrees of freedom determined by the sample size. A larger sample size (higher degrees of freedom) will lead to a chi-square distribution with less spread and therefore a narrower confidence interval for the standard deviation. (3) A detailed mathematical assessment of Gage R&R sample size described by the reference book Design and Analysis of Gage R&R 8 allows us to reach the following conclusions: (# of parts) × (# of operators) × (# of trials − 1) ≥ 15. Using 3 or more operators in a Gage R&R study is recommended to reduce the risk of getting an inflated estimation of the reproducibility, which has a tighter threshold. (c) Non-destructive Test Method (1) Use a minimum of 3 test samples for a Gage R&R study. If the scope of the TMV includes multiple products/models/tabs, use one of the following three approaches to select samples for testing: (a) Use samples of the product/model/tab that provide the greatest challenge toward the TMV acceptance criteria (for example, the product that has high test method variability but the tightest tolerance range). (b) Use the bracketing technique. Run two or more Gage R&R studies, each for a product/model/tab that represents a possible extreme condition. Using this strategy is necessary to adequately assure the test method performance across the whole spectrum bracketed by those extreme conditions. (c) Pool samples of different products/models/tabs in one Gage R&R study. This approach applies when the products/models/tabs being pooled have the same tolerance, or the test method variation across different products/models/tabs can be assumed comparable. (2) Select at least 2 appraisers from a pool of all qualified test method appraisers, at random or covering a broad range of experience levels. (3) Follow Table 3 to determine the minimum number of trials (repeated readings) for each appraiser and each test sample.
(4) Randomize the test order and prepare a datasheet if necessary. (d) Destructive Test Method (1) For a destructive test method, it is difficult to separate the test method repeatability from the part-to-part variation. However, both the test method repeatability and reproducibility can still be assessed in a destructive Gage R&R study. Four methods can be used: (a) Use surrogates. A surrogate is a substitute for the actual product/part that is representative of the test method intended use. The surrogate can either be measured repeatedly as a non-destructive test, or the surrogate’s part-to-part variability might be small enough compared to the repeatability component of the measurement error that it has a minimal impact on the total measurement error. To use a surrogate, the description of the surrogates and justification of their representativeness of the test method intended use should be included in the TMV protocol. The number of surrogate samples, number of appraisers, and number of trials should meet the requirement in Table 3. For example, using magnets of different strengths to represent the range of seal strength to be studied is a possible idea. (b) Use master groups. A master group (also known as a master batch or master lot) is a collection of homogeneous units with small part-to-part variation (for example, units taken from a single manufacturing batch). In a Gage R&R study, each master group will be treated as a hypothetical “test part” and every unit within the master group will be treated as a trial. Notice that under this method, the estimated repeatability will include not only the true test method repeatability, but also the part-to-part variation within the master groups. To use the master group method for a Gage R&R study, the master groups should be clearly defined and documented in the TMV protocol. The number of master groups, number of appraisers, and number of trials are suggested in Table 3.
(c) If the master group homogeneity cannot be presumed, a Gage R&R analysis can still be run to estimate the test method reproducibility using the master group method. However, the test method repeatability will be overestimated by including a large part-to-part variation within the master groups. In this case, one can run a separate study using surrogates, standards, or other techniques to estimate the test method repeatability. The total standard deviation of the test method will then be calculated according to the Root Sum Square (RSS) 9 method: σ_TM = √(σ²_repeatability + σ²_reproducibility). (d) In some cases, when measurement of a test sample is not completely repeatable, changes in subsequent measurements can be characterized using engineering or statistical models. These models may help to estimate the true test method repeatability. Using this method often involves advanced scientific or statistical methods, and users should consult with an SME for assistance and approval. Note 1: For commonly used destructive packaging test methods, like ASTM F88 Heat Seal Peel Testing, it is important to understand that when doing a test method validation on a tensile tester (the instrument for this method), one should be concerned with the variation in the method and not the material. Thus, if possible, try to use materials with as little variation as possible that still cover the range of force that will be tested. (3) Power/Sample Size Study—A Power/Sample Size study only applies to validating a comparative test method that is used for comparing the means of populations using statistical tests (for example, 1-sample t test, 2-sample t test, or ANOVA). The purpose of running a Power/Sample Size study is to ensure that the variability of the test method does not obscure the difference of means (the “signal”) that should be detected by the statistical test with an adequate power (≥ 80%) 10 .
(a) To conduct a Power/Sample Size study: (1) Establish the practical significance Δ for comparing the means and document the Δ with rationale in the TMV protocol and/or report. (2) Estimate the total standard deviation based on a minimum of n=15 measurements. These measurements should be collected from 5 or more unique samples. (3) Calculate the required sample size n for the statistical test to detect Δ with ≥ 80% power. (4) Accept the test method if the sample size n is acceptable. (4) Use Existing Data—Existing data could be used for variable TMV if it meets the following requirements: (a) Existing data should be stored in a controlled manner with full traceability to its original sources (for example, data stored in the DHF, in an approved electronic lab notebook (ELN), or as an attachment to an approved change-controlled document). (b) Existing data should be collected from measuring products or parts that are within the validated range of the test method, or representative of the test method’s intended use. (c) The TMV report should include the rationale on how the existing data used meet these requirements. 4.3.3.4 Step #4 – Complete TMV Protocol: (1) Except when using the “Report-only” approach, a released TMV protocol is required per ISO 11607 prior to executing the validation testing. The TMV protocol should include, at minimum, the following contents: (a) Purpose and scope of the TMV protocol. (b) Reference to the test method document being validated. (c) A list of references to other related documents, if applicable. (d) A list of the types of equipment, instruments, fixtures, etc. used for the TMV.
(e) TMV study rationale, including:
(1) Statistical method used for the TMV;
(2) Characteristics measured by the test method and the measurement range covered by the TMV;
(3) Description of the test samples and the rationale for their selection;
(4) Number of samples, number of appraisers, and number of trials;
(5) Data analysis method, including any historical statistics that will be used for the data analysis (for example, the historical average for calculating %P/T with a one-sided specification limit).
(f) TMV acceptance criteria.
(g) Validation test procedures (for example, sample preparation, test environment setup, test order, data collection method, etc.).
4.3.3.5 Step #5 – Prepare Appraisers:
(1) Select the required number of appraisers from a pool of all qualified test method appraisers, covering a broad range of skills and experience levels.
(2) Train the appraisers on the test method prior to executing the TMV protocol. Training should include any special setup of the test environment and operation of equipment unique to the TMV for which the appraisers will be responsible prior to running the test method. Training should be documented per company training requirements.
4.3.3.6 Step #6 – Prepare Test Samples:
(1) Collect the required number of test samples and prepare them for validation testing:
(a) The test samples should be within the TMV scope, except when using representative surrogates or standards.
(b) For a TMV using the Capability Study or Power/Sample Size study, the test samples should be from a nominal build that represents the actual design and/or production process.
(c) For a TMV using a Gage R&R study, the test samples may include nominal, borderline, and out-of-specification units. This allows assessment of the test method performance across the tolerance range.
(d) Prepare the test samples in a consistent manner.
(e) Label and/or control every test sample carefully to prevent mix-up and bias.
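The Root Sum Square combination described earlier in item (c), where repeatability and reproducibility are estimated in separate studies, can be sketched as follows; the component values are hypothetical.

```python
from math import sqrt

def rss_total_sigma(sigma_repeatability, sigma_reproducibility):
    """Combine independently estimated repeatability and reproducibility
    components into a total test-method standard deviation (Root Sum Square).
    Valid only if the two components are independent."""
    return sqrt(sigma_repeatability ** 2 + sigma_reproducibility ** 2)

# Hypothetical component estimates in lbf: repeatability from a separate
# surrogate study, reproducibility from the master-group Gage R&R analysis.
sigma_tm = rss_total_sigma(0.03, 0.04)   # 0.05 lbf
```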
4.3.3.7 Step #7 – Execute Validation Protocol:
(1) To execute the TMV protocol:
(a) Complete equipment and test environment setup.
(b) Check the gage calibration and/or verification, if applicable.
(c) Follow the test order as specified in the TMV protocol.
(d) Record the test results concurrently.
(e) Maintain traceability of every measurement result to the appraiser and test sample.
(f) Document any deviations from the protocol for review.
(g) Retain the test samples and other relevant information, whenever possible, until the validation report is completed and approved.
(h) If any adjustment of the test environment or equipment/fixture setup is found to be needed in the middle of validation testing, the impact of the adjustment on subsequent measurements should be evaluated.
4.3.3.8 Step #8 – Analyze TMV Results:
(1) Analysis of TMV results should follow industry-recognized statistical methods. The effect of any deviations from the TMV protocol should be assessed for impact on the TMV results.
(2) Capability Study:
(a) Assess the distribution fit and calculate the capability index (Ppk or Cpk for a 1-sided specification, Pp or Cp for a 2-sided specification) using the appropriate distribution and/or transformation.
(3) Gage R&R Study:
(a) Run a Gage R&R Study (Crossed) analysis for a non-destructive Gage R&R study.
(b) Run a Gage R&R Study (Nested) analysis for a destructive Gage R&R study.
(c) If the Gage R&R study covers different products/models/tabs, use the worst-case tolerance range or the worst-case process standard deviation to calculate %P/T or %R&R.
(d) If the test output has only a one-sided specification limit, the %P/T calculation needs to use either a historical average or the Gage R&R study average of the test output (see the %P/T definition above). The historical average, if used, should be calculated from a valid data source (for example, data from a released technical report, design verification report, or process validation report).
The Gage R&R study average can only be used when it is representative of the test output's nominal mean.
(4) Power/Sample Size Study:
(a) Use a 95% confidence level for the power and sample size calculation. A different confidence level can be used with appropriate rationale included in the TMV protocol and/or report. It is suggested that the confidence level used be tied to the risk associated with failure; for example, if assessing a feature critical to the function of a product, use a higher confidence level such as 95%.
(b) Use the one-sided alternative hypothesis for the power and sample size calculation, if applicable.
(5) Complete TMV Report—If the minimum performance requirements have been met as defined by the TMV acceptance criteria, accept the results and complete a TMV report that includes:
(a) A scope section that clearly identifies the products/parts and/or the range of measurements that are covered by the TMV;
(b) References to the test method document and other related documents;
(c) A list of equipment, instruments, fixtures, etc. used for the TMV;
(d) Any deviations from the TMV protocol, with appropriate rationales and an evaluation of their effect on the validation results;
(e) Statistical analysis (graphical and/or numerical outputs) of the test results relevant to the TMV acceptance criteria;
(f) A TMV conclusion;
(g) Attachment of the raw data to the properties page of the TMV report, or a reference to the raw data location in an approved electronic lab notebook (ELN) or data storage system per site practice;
(h) If some raw data are excluded from the statistical analyses, appropriate rationale explaining why these data are not included and why a repeat validation is not warranted (for example, what the conditions were and why they occurred);
(i) If a TMV fails, the investigation of the failure and the corrective actions taken.
(j) Description of any changes to be made to the test method following validation testing and prior to release of the TMV report. For changes not requiring re-validation testing, rationale should be provided justifying why no re-validation testing is needed.
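The Step #8 calculations can be sketched as follows. This is a minimal sketch assuming the common textbook forms of Ppk and of the two-sided %P/T (with the 5.15 factor discussed in 4.1.5.2); the seal-strength data are hypothetical and reuse the 1.0-1.5 lb/in specification from the 4.2.4 example.

```python
from statistics import mean, stdev

def ppk(data, lsl=None, usl=None):
    """Process performance index from the overall (long-term) sample standard
    deviation; with a single specification limit this is the one-sided index."""
    mu, sigma = mean(data), stdev(data)
    indices = []
    if usl is not None:
        indices.append((usl - mu) / (3 * sigma))
    if lsl is not None:
        indices.append((mu - lsl) / (3 * sigma))
    return min(indices)

def pct_p_to_t(sigma_tm, lsl, usl):
    """Two-sided %P/T: the middle 99 % of the test-method distribution
    (5.15 * sigma_tm) as a percentage of the tolerance width."""
    return 5.15 * sigma_tm / (usl - lsl) * 100

# Hypothetical seal-strength measurements (lb/in) against the
# 1.0-1.5 lb/in specification.
data = [1.20, 1.25, 1.22, 1.28, 1.24]
ppk_value = ppk(data, lsl=1.0, usl=1.5)       # about 2.62
pct_pt = pct_p_to_t(0.01, lsl=1.0, usl=1.5)   # 10.3 %
```

With these hypothetical numbers the method would pass a 10 % or looser %P/T criterion but fail a 6 % one, illustrating how the acceptance threshold, not just the statistic, drives the validation outcome.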
Technical committee: F02.50
Similar standards/plans/regulations:
ASTM D5663-15(2020) (Active, 2020-11-01): Standard Guide for Validating Recycled Content in Packaging Paper and Paperboard
ASTM E2857-22 (Active, 2022-04-01): Standard Guide for Validating Analytical Methods
ASTM D4821-20 (Active, 2020-11-01): Standard Guide for Carbon Black—Validation of Test Method Precision and Bias
ASTM F3321-19 (Active, 2019-10-01): Standard Guide for Methods of Extraction of Test Soils for the Validation of Cleaning Methods for Reusable Medical Devices
ASTM F3208-20 (Active, 2020-07-15): Standard Guide for Selecting Test Soils for Validation of Cleaning Methods for Reusable Medical Devices
ASTM F3293-18 (Active, 2018-05-01): Standard Guide for Application of Test Soils for the Validation of Cleaning Methods for Reusable Medical Devices
UNE 49705-2002 (Active, 2002-12-26): Packaging. Packaging for the transport of fruits and vegetables. Guide of test methods.
ASTM E2898-20a (Active, 2020-06-01): Standard Guide for Risk-Based Validation of Analytical Methods for PAT Applications
ASTM D4919-23 (Active, 2023-03-15): Standard Guide for Testing of Hazardous Materials (Dangerous Goods) Packagings
ASTM D8282-19 (Active, 2019-09-01): Standard Practice for Laboratory Test Method Validation and Method Development
ASTM E2918-23 (Active, 2023-08-01): Standard Test Method for Performance Validation of Thermomechanical Analyzers
ASTM D8409-21 (Active, 2021-10-15): Standard Guide for Conducting Stacking Tests on UN Packagings Using Guided or Unguided Loads
ASTM A700-14(2019) (Active, 2019-11-01): Standard Guide for Packaging, Marking, and Loading Methods for Steel Products for Shipment
ASTM E3116-23 (Active, 2023-06-01): Standard Test Method for Viscosity Measurement Validation of Rotational Viscometers
ETSI EG 202 107 (Active, 1999-05-01): Methods for Testing and Specification (MTS); Planning for validation and testing in the standards-making process
ASTM D7660-20 (Active, 2020-04-01): Standard Guide for Conducting Internal Pressure Tests on United Nations (UN) Packagings
ASTM D1849-95(2019) (Active, 2019-04-01): Standard Test Method for Package Stability of Paint
ASTM F3438-24 (Active, 2024-03-15): Standard Guide for Detection and Quantification of Cleaning Markers (Analytes) for the Validation of Cleaning Methods for Reusable Medical Devices
ETSI ETR 304 (Active, 1996-12-01): Methods for Testing and Specification (MTS); The future in ETSI of quality of standards-making, validation and testing
YY/T 0681.1-2018 (Active, 2018-12-20): Test methods for sterile medical device package—Part 1: Test guide for accelerated aging