Abstract – A declared purpose of the BPMN standard was to provide a business process modeling language usable by process modelers regardless of their technical background. This aim was to be achieved through extensive documentation of the notation’s syntax rules, as well as through best practices for process modeling proposed by practitioners. The wide acceptance of the BPMN standard seems to have accomplished this purpose, namely when considering its usage in business-oriented process documentation and improvement scenarios, as well as in the IT implementation of process models supported by software tools. However, a relevant question may arise regarding the correctness of business process models produced by modelers with different profiles. This issue is important since the conformance of process models to the syntax rules of the language determines the quality of the modeling process, whatever its purpose. Therefore, the main purpose of this work was to gather statistical evidence that could validate the assertion that BPMN models have the same level of correctness irrespective of the technical profile of the people who produce them. This paper gathers evidence regarding this issue by conducting a between-groups empirical study with process modelers with business-oriented and IT-oriented profiles.
Keywords – Business process modeling; Visual languages; BPMN; empirical study
The BPMN 2 standard, published by the Object Management Group (OMG), has as one of its main purposes to provide a notation understandable by different kinds of process modelers and users: (1) business analysts who sketch the initial documentation of business processes; (2) process developers who are responsible for actually implementing business processes; (3) business users who are accountable for the instantiation and monitoring of business processes.
The requirement for accurate specification of process models is relevant for several activities in the organization, namely when delivering process specifications to meet regulatory and legal conditions (e.g. SOX, BASEL III), as well as for the analysis, design and development of process-aware information systems, service-oriented architectures, and web-services-like systems.
Although BPMN can be used for all these purposes, the business and technical models produced are quite different in nature. The main focus of BPMN business models, for documentation purposes, is on comprehension of the basic process flow. Hence the emphasis on the happy path, avoiding excessive detail. Exception handling and abnormal situations are often bypassed. On the other hand, the execution capabilities of the BPMN language are relevant when IT-oriented models are produced. Process developers require the translation of BPMN models into machine-readable languages, both for sharing models across multiple domains using different technologies and for enactment in distributed environments (e.g. integrating BPEL and web services standards). So, BPMN as a process modeling language joins different levels of abstraction. These perspectives encompass different sorts of constructs, both for the static graphical representation of models and for properties that provide information for process simulation and execution. However, both kinds of BPMN modelers have to ensure that the resulting process models are syntactically correct and well-formed. To validate the suitability of BPMN for distinct profiles of process modelers, as stated in the OMG documentation, empirical evidence should be collected.
In this paper the authors report an experiment conducted with two kinds of process modelers (business-oriented and IT-oriented modelers), in order to assess the adequacy of BPMN as a modeling language irrespective of the technical background of the modeler. To gather statistical evidence regarding this issue, a between-groups empirical study was conducted with process modelers with two different profiles.
This work is structured as follows. In Section II the applied research method is detailed. The empirical study is described in Section III, by going through the study’s definition (Section III.A), planning (Section III.B), execution (Section III.C), data analysis (Section III.D), and results analysis (Section III.E). In Section IV previous related work on process modeling is outlined. Finally, Section V concludes this paper and suggests future work.
The research method
The process of making science requires that activities be replicable by others acting under the same conditions. Thus, the selection of a research method is crucial for drawing grounded conclusions about a phenomenon. The research method determines the resources needed for the study, as well as the perspective from which the factors and causes that shape the phenomenon are analyzed. In this work we chose a research method based on the scientific method. The applied research method was customized to the context of the empirical study, based on frameworks from the Experimental Software Engineering field [6, 7]. Therefore, the following main activities were proposed for the current work:
- Definition – in this activity the addressed research problem is stated, alongside the formulation of the research question concerning the expected result on the BPMN models produced by modelers, the objective of the experiment, and the context in which the experiment will be carried out (Section III.A);
- Planning – this set of activities is about how the experiment will be performed, which involves more detailed decisions concerning the context of the experiment. The formulation of the hypothesis under study is the basis for the experiment’s design, in order to answer the initial research question. Also necessary is the elicitation of the set of independent and dependent variables that will be used in the statistical test, as well as the selection of subjects to participate in the experiment. The quantitative experiment is designed to filter out external factors and to avoid bias in the results. The experiment’s instrumentation, as well as a prior evaluation of its validity, conclude the phase (Section III.B);
- Execution – consists of the instantiation of the previously established plan, constrained by the specific circumstances found in the actual experiment. In this phase data is collected to gather empirical evidence (Section III.C);
- Data Analysis – the activities involved herein are the data set description and reduction, as well as the test of the hypothesis defined during the experiment plan (Section III.D);
- Results Analysis – since the overall research is constructed to allow comparability with the results of other researchers who may eventually repeat the experiment, this set of activities consists of packaging the results so they can be used by the community. This involves documenting the whole experimental process and discussing the results achieved, after their statistical analysis has been performed, focusing on aspects such as the interpretation of the results, the study’s limitations, the suitability of inference to the population of BPMN modelers, and the identification of lessons learned (Section III.E).
The following sections detail the empirical study that instantiated the above-mentioned activities.
Empirical study on BPMN
Empirical Study Definition
Following the guidelines of the framework proposed in the previous section, we address here the activities that are part of the experiment’s definition. Those activities consist of:
- the specification of the research problem, by justifying the importance of the empirical study;
- the formulation of the research objective, by highlighting the aim of the study; and
- the definition of the context of the study, by defining the environment that constrains the previous two (research problem and objective).
The research problem under investigation is concerned with assessing whether different kinds of process modelers are able to deliver BPMN process models with the same quality (i.e., process models with the same degree of correctness). Therefore, the sample to be tested should come from BPMN models produced by process modelers with different technical backgrounds. The degree of correctness of process models produced by one kind of modeler should be measured and compared with the level of correctness of models built by modelers with a different background.
From the attained results it is expected to confirm (or not) whether the process modeling language BPMN can be used to produce process models with the same degree of correctness, irrespective of the process modelers’ technical background.
To address the above-mentioned problem systematically and rigorously, and to precisely delimit the empirical study’s boundaries, the research question was formulated as follows:
- What is the likelihood of process modelers with different technical backgrounds delivering BPMN models with the same level of correctness?
The activities described next were developed in order to answer the stated research question. First, the research question was refined into a research goal, which, in turn, led to the specification of a research hypothesis. To provide a grounded answer to the research question, one has to analyze a sample of process models built by process modelers with different technical backgrounds.
The setup of the experiment was based on the Goal-Question-Metric (GQM) framework. The GQM model has a hierarchical structure that starts from a top-level goal definition. This goal is refined into several questions that usually break down an issue into its major components. Each question is in turn refined into metrics. The same metric can be used to answer different questions under the same goal. The GQM framework is instantiated through a template with a fixed set of items (study object, purpose, quality focus, viewpoint and environment). The template helps delimit the experiment’s boundaries. This is achieved by focusing on the relevant goal, determining the entities to measure, the dependent and independent variables to choose, and the hypothesis to formulate.
For the current empirical study, the GQM framework was instantiated by matching the research objective of the study with the GQM’s top-level goal, and stating this goal in the terms prescribed by the above-mentioned template. So, the GQM goal for the empirical study was formulated as:
- Analyze BPMN models (study object),
- for the purpose of checking compliance with BPMN syntax by business analysts and process developers (purpose),
- with respect to the assessment of models’ degree of correctness (quality focus),
- from the point of view of the BPMN standard (viewpoint),
- in the context of an experimental study constrained by surrogates of actual BPMN process modelers (environment).
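The hierarchical goal, question and metric structure described above can be sketched as a small data model. This is only an illustration: the class and field names (`Goal`, `Question`, `Metric`, `study_object`, and so on) are assumptions introduced here, not part of the GQM framework or of the study's tooling; the field values reuse the wording of the template items above.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    description: str


@dataclass
class Question:
    text: str
    metrics: list[Metric] = field(default_factory=list)


@dataclass
class Goal:
    # The five template items of a GQM goal definition.
    study_object: str
    purpose: str
    quality_focus: str
    viewpoint: str
    environment: str
    questions: list[Question] = field(default_factory=list)


# Hypothetical instantiation mirroring the template items of this study.
goal = Goal(
    study_object="BPMN models",
    purpose="checking compliance with BPMN syntax by business analysts "
            "and process developers",
    quality_focus="models' degree of correctness",
    viewpoint="the BPMN standard",
    environment="experimental study with surrogates of BPMN process modelers",
    questions=[
        Question(
            text="Do modelers with different technical backgrounds deliver "
                 "BPMN models with the same level of correctness?",
            metrics=[
                Metric("Total of Errors",
                       "number of BPMN syntax rules violated in a model"),
            ],
        )
    ],
)

print(goal.questions[0].metrics[0].name)
```

Because a metric is a separate object, the same `Metric` instance could be attached to several questions, which matches the remark that one metric can answer different questions under the same goal.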
The last item of the GQM instantiation, the context of the experiment, determines whether the experimental results are generalizable or not to a broader context. It is important to make the context explicit for the sake of the results’ comparability, since each experiment can have its own distinct context.
Context Definition of the study
The study presented herein was developed by collecting a sample of BPMN models through an experiment carried out with students of two degrees, in a course on BPMN modeling taught at an academic institution and supported by a process modeling case study.
Although the conclusions of this empirical study may be generalizable to other BPMN modeling contexts, caution must be taken before generalizing, and further research is required to confirm the attained results.
Empirical Study Planning
This set of activities aims to develop the topics regarding the definition of the research goal, the formulation of the hypothesis under study (together with the independent and dependent variables to be used for testing it), the definition of criteria for selecting the subjects participating in the experiment, the experiment’s design and instrumentation, and the assessment of the experiment’s validity.
Since the current empirical study has only one research objective, no sub-goal was derived. Therefore, the research goal was stated in the same terms as the research objective.
Hypothesis and Variables
The hypothesis presented here is intended to verify the existence of empirical evidence that could corroborate the claim that BPMN is a general-purpose process modeling language, amenable to use by any kind of process modeler. By not rejecting this hypothesis, we expected to help back up the assertion that BPMN is suitable both for business analysts and process developers. Thus, the previously set goal led to the formulation of the null hypothesis in the following terms:
- H0: the quality of BPMN models, measured in terms of their correctness, shows no significant difference regardless of the technical background of process modelers (business analysts or process developers).
The independent and dependent variables used for hypothesis H0 are listed in Table I. The dependent variable is specified in terms of the number of errors detected in the BPMN model, as a result of violations of the syntax defined in the BPMN standard. The description of each variable refers to the attribute to be measured, the computation rule to be applied, and the measurement unit assigned.
|Variable Role||Variable Name||Description|
|Independ.||Modeler Type||A nominal variable which identifies the technical background of the BPMN modeler (business analyst or process developers) in the sample.|
|Depend.||Total of Errors||An absolute scale variable that conveys the number of BPMN syntax rules violated in the BPMN model.|
The sampling strategy was a combination of the simple organization (all subjects of the sample treated equally) with convenience sampling (subjects chosen based on their easier availability).
The experiment was conducted at an academic institution, and included subjects attending two degrees with distinct curricula. The participants were first-cycle undergraduate students, in the second semester of their academic year. The characteristics of the two degrees are the following:
- Technology and Industrial Management – this degree includes courses with topics covering industrial processes and information systems (with process modeling using BPMN). Given the business and industry oriented nature of the topics taught, the students from the second year of this degree were chosen as surrogates of business analysts;
- Informatics Engineering – in this degree computer science topics are taught, covering procedural and object-oriented programming languages, systems analysis and design (using UML) and process modeling (using BPMN). The greater prevalence of software development topics in this degree led us to choose students from its third year as surrogates of process developers.
The students of both degrees attended a BPMN process modeling course in the same semester, which was a good opportunity to enroll them in the same experiment.
The context of the course was a blended-learning environment, with all the students having their first contact with the BPMN language, supported by an e-learning platform and regular classes. Course materials (course notes and tool tutorials) on BPMN modeling were provided to the students, as well as the course’s assignments. All the assignments contributed to the final score of the course with a certain weight. One of the mandatory assignments was the BPMN case study used as part of the current experiment. A complete description of the case study is provided in the Annex.
The total number of participants in the experiment was 51 (18 acting as process developers and 33 as business analysts). The experiment had two phases:
- Training – this phase was mainly a period of familiarization with the BPMN language. Therefore, the output produced in this phase was not used in data analysis. Detailed information about the financial services domain, supported by Automated Teller Machines (ATMs), was delivered to the students. The process modelers were asked to read, understand and model the exercise using BPMN. A deadline of one month was given to accomplish this preliminary assignment.
The main intent of this phase was that students could: (1) become acquainted with the process modeling tasks they would perform in the actual case study; (2) practice the process modeling and BPMN concepts taught in regular classes; (3) become familiar with the provided CASE tool, Sparx Enterprise Architect, which is capable of basic syntactical checking of BPMN models.
- The Experiment – the actual experiment took place in a laboratory class with a time frame of two hours. At the beginning, the instructor explained to the participants what they were expected to do during the experiment, namely: (1) read the business process case study description (a shorter part of the overall case presented in the previous phase); (2) complete the BPMN model and store it in a delivered repository of the BPMN tool. As part of this repository, a baseline of the solution was provided to the students, in order to confine the set of modelers’ solutions to a specific part of the general problem.
As previously mentioned, the primary goal of the experiment was to evaluate the correctness of BPMN models, assuming that they were built by process modelers with two distinct modeling profiles (business analysts and process developers).
This section describes the design of the experiment and the role played by undergraduate students, as surrogates of process modelers, on artifacts production.
For the non-experimental study conducted to test the H0 hypothesis, we chose a quasi-experimental design. A quasi-experimental design involves the selection of groups upon which a variable is tested, without any random pre-selection process. The researcher treats the situation as an experiment even though, strictly by design, it is not. The independent variable may not be manipulated, treatment and control groups may not be randomized or matched, or there may be no control group, so the researcher is limited in drawing conclusions.
Quasi-experiments with non-equivalent groups are often used when interventions are carried out in an academic context and the groups correspond to different classes. For a case study in an academic environment such as the one we are describing, and given the constraints, a quasi-experimental design was the best option, since the gathered figures and results still allow some statistical analysis.
The time and resources available for the experiment, as well as the constraints of academic rules, made it infeasible to constitute a control group. In addition, since students did not need to be randomized, the time and resources assigned to the experimentation were reduced.
So, in the experiment students were assigned to non-equivalent groups and all were submitted to the same treatment. The division of subjects between groups was one of convenience, in order to cause as little disruption in classes as possible. Although the groups’ probabilistic equivalence was lost, one could still compare the groups. To compose the non-equivalent groups, the students were assigned based on their own degree. During the experiment, the two sets of modelers with different backgrounds produced the BPMN models using the same testing factor: the proposed case study. One group, composed of surrogates of business analysts, was more knowledgeable about organizational processes and had more domain skills. The other group, with surrogates of process developers, had more modeling proficiency and previous knowledge of other graphical notations, such as UML, as well as programming skills. It was expected that any deviation in the results attained by the groups would be due to BPMN not being equally suitable for business analysts and process developers, contrary to what the BPMN standard advocates.
The effectiveness of the process modelers’ interventions, given their different skills, was assessed by a post-test. Both groups received the treatment over the same period of time, and got exactly the same support from the teacher in charge of monitoring the experiment. We relied on statistical analysis to determine whether the characteristics of the groups had a significant effect on the attained results.
The post-test quasi-experiment with non-equivalent groups and between-subjects design can be concisely written as:
N1 X O
N2 X O
This means that each non-equivalent group (N1 and N2) was subject to a treatment (X), followed by the observation (O) of the responses.
The data from the sample were produced by undergraduate students, behaving as surrogates of process modelers, in a time frame of two hours. They had to complete a provided baseline with the case study’s solution, by designing the new required BPMN models. The new models had to be delivered in a file format readable by the Enterprise Architect tool.
After all the repositories with participants’ solutions were collected, they were loaded in order to check the BPMN models for possible syntax violations. The verification was performed automatically, using a BPMN model checker tool. This tool allows the verification of BPMN models against the BPMN metamodel, enriched with well-formedness rules and best practices suggested by practitioners, implemented as OCL invariants. The BPMN syntax violations detected in the models were stored in a text file, for further processing and transformation into an SPSS data file. Finally, the statistical treatment required for testing the formulated hypothesis could be carried out.
The instrumentation of the study required off-the-shelf tools. Each tool was used independently. The experiment’s environment followed the pipes and filters software architectural style. The role of each tool is detailed next:
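As a rough illustration of what a well-formedness check of this kind does, the sketch below tests two rules stated in the BPMN 2.0 specification: a start event must not be the target of a sequence flow, and an end event must not be the source of one. It is written in Python rather than OCL, and the model representation (`SequenceFlow`, `node_types`, `check_model`) is a hypothetical simplification, not the metamodel or the invariants used by the actual checker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceFlow:
    source: str
    target: str


def check_model(node_types: dict[str, str], flows: list[SequenceFlow]) -> list[str]:
    """Return one message per violated rule instance (hypothetical format)."""
    violations = []
    for node, kind in node_types.items():
        has_incoming = any(flow.target == node for flow in flows)
        has_outgoing = any(flow.source == node for flow in flows)
        # BPMN 2.0: a start event must not be the target of a sequence flow.
        if kind == "startEvent" and has_incoming:
            violations.append(f"{node}: start event has an incoming sequence flow")
        # BPMN 2.0: an end event must not be the source of a sequence flow.
        if kind == "endEvent" and has_outgoing:
            violations.append(f"{node}: end event has an outgoing sequence flow")
    return violations


# Toy model with one deliberate error: a flow leading back into the start event.
node_types = {"start": "startEvent", "review": "task", "end": "endEvent"}
flows = [
    SequenceFlow("start", "review"),
    SequenceFlow("review", "end"),
    SequenceFlow("review", "start"),
]
violations = check_model(node_types, flows)
print(len(violations), violations)
```

The length of the returned list plays the role of the Total of Errors variable for one model.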
- Enterprise Architect – a BPMN graphical editor used by process modelers to build the model of the case study;
- Eclipse – the IDE for building the BPMN2USE Java application used for querying and extracting from the Enterprise Architect repository, the solution built by each process modeler;
- SPSS – a statistical tool for processing and analysis of collected data. The tool was also used to perform the statistical hypothesis test;
- USE (UML based Specification Environment) – a tool that allows the specification of models and metamodels using the UML class diagram, enriched with expressions in OCL to specify integrity constraints;
- BPMN2USE – a transformer that takes as input a BPMN graphical model produced with the Enterprise Architect tool and instantiates the BPMN metamodel using the USE syntax.
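The last stage of this tool chain, turning the text file of violations into a data file for statistical analysis, can be sketched as a small filter. The log format assumed here (one tab-separated `model_id` and rule name per line), the function names and the CSV layout are all hypothetical; the source does not describe the actual file formats.

```python
import csv
import io
from collections import Counter


def count_violations(log_lines):
    """Count violations per model from lines of the assumed form 'model_id<TAB>rule'."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        model_id, _rule = line.split("\t", 1)
        counts[model_id] += 1
    return counts


def build_rows(modeler_types, counts):
    """One row per model, including models with no violations at all."""
    return [
        (model_id, modeler_type, counts.get(model_id, 0))
        for model_id, modeler_type in sorted(modeler_types.items())
    ]


def to_csv(rows):
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["model_id", "modeler_type", "total_errors"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input: three models, two of them with violations.
log = [
    "m01\tStartEventHasIncomingFlow",
    "m01\tEndEventHasOutgoingFlow",
    "m02\tGatewayWithoutOutgoingFlow",
]
modeler_types = {"m01": "developer", "m02": "analyst", "m03": "analyst"}
rows = build_rows(modeler_types, count_violations(log))
print(to_csv(rows))
```

Keeping the list of all delivered models separate from the violation log matters: a model with zero errors never appears in the log, yet it must still contribute an observation to the sample.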
Furthermore, as part of the instrumentation process, the participants in the experiment received training material regarding BPMN, as well as a baseline with the template of the ATM case study solution.
Threats to validity
One kind of threat inherent to the use of statistical tests may stem from an inconsistent administration of the treatment across groups. Such variations can happen if different people apply the treatment, as was the case here.
Empirical Study Execution
After the planning phase, the experimental work was carried out in a laboratory class of the course. As previously mentioned, the case study assignment was the basis for data collection in the experiment, together with the pedagogical objective of practicing the concepts acquired in the course. All the students who started the experiment concluded it, so there was no subject mortality in the experiment. Nondisclosure of individual responses was ensured to all participants.
Before the experiment started, at the beginning of the in-class lab, the students received the case study’s description, as well as the repository of the CASE tool with the baseline of the assignment’s solution. Students were also briefed about the data that would be collected during the class. However, they were not informed about the details of the research, since this could jeopardize the validity of the results. The case study counted towards the students’ final grade, so there was an incentive to perform it well.
After the students carried out the experiment as part of the lab class, the verification of results took place. The verification of the BPMN models did not interfere with the delivery of the solutions by students, since it was done off-line, after the class. The verification process was carried out by a panel of instructors, using mainly the BPMN2USE and USE tools. After the verification of the experiment’s outputs was finished, information regarding the number of errors found in the models was made available for analysis. For the present experiment, the severity of the errors was not considered relevant, only how many of them were found in a process model.
Empirical Study Data Analysis
In this phase the collected data regarding design errors made by process modelers with different backgrounds were analyzed. This process is detailed in the next sections and involves the description of the data sets, as well as the test of the hypothesis raised in the experiment’s planning.
In this section data exploration begins by describing the variables in the data set.
Table I already identifies the dependent variable Total of Errors (the total number of errors found in a model) and the independent variable Modeler Type (the background of the process modeler). The positive skewness and kurtosis of the dependent variable (Table II) indicate an asymmetric (skewness = 0.792 > 0) and leptokurtic (kurtosis = 1.19 > 0) distribution, with a higher frequency of lower values. This behavior of the global error distribution (irrespective of process modeler background) is similar to the error distribution of each particular type of modeler taken separately.
|Statistic||Total of Errors|
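For readers who want to reproduce descriptive statistics of this kind, the sketch below computes the bias-adjusted sample skewness and excess kurtosis, which are the estimators commonly reported by statistical packages (that SPSS was configured this way in the study is an assumption). The `errors` list is made-up illustrative data, not the study's sample.

```python
import math


def sample_skewness(values):
    """Bias-adjusted sample skewness (often denoted G1)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    third = sum((x - mean) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * third / s ** 3


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (often denoted G2); 0 for a normal shape."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    fourth = sum((x - mean) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth / variance ** 2 - correction


# Hypothetical error counts: many low values and one large one (right-skewed).
errors = [0, 1, 1, 2, 2, 2, 3, 9]
print(sample_skewness(errors), sample_excess_kurtosis(errors))
```

A positive skewness means the long tail lies towards high error counts, and a positive excess kurtosis means heavier tails than a normal distribution, which is how the values 0.792 and 1.19 are read in the text.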
Table III summarizes the results of the two tests, which confirm the non-normality of the variable. The null hypothesis for each test was that the sample comes from a Gaussian (normal) distribution. Conversely, the alternative hypothesis was that the sample comes from a non-normal distribution. With a p-value (Sig.) < 0.05 in both tests for the Total of Errors variable, we could not assume a normal distribution of the data and therefore had to use non-parametric tests to assess hypothesis H0.
|Variable||Kolmogorov-Smirnov (a): Statistic||df||Sig.||Shapiro-Wilk: Statistic||df||Sig.|
|Total of Errors||0.143||51||0.011||0.952||51||0.038|
|a. Lilliefors Significance Correction|
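The Kolmogorov-Smirnov statistic with Lilliefors correction can be sketched as follows: the sample's empirical distribution is compared with a normal distribution whose mean and standard deviation are estimated from the same sample. The decision threshold used here, 0.886/√n at the 5% level, is the commonly tabulated large-sample approximation for n > 30 and is an assumption of this sketch (SPSS computes a p-value instead). Reading the first statistic in Table III (0.143, n = 51) as this distance is suggested by the Lilliefors footnote.

```python
import math
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(values):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Check the gap just after and just before each step of the empirical CDF.
        distance = max(distance, i / n - cdf, cdf - (i - 1) / n)
    return distance


def critical_value_05(n):
    """Approximate 5% critical value for n > 30 (assumed tabulated approximation)."""
    return 0.886 / math.sqrt(n)


# Values as reported in Table III.
reported_d, reported_n = 0.143, 51
rejects_normality = reported_d > critical_value_05(reported_n)
print(round(critical_value_05(reported_n), 3), rejects_normality)
```

With n = 51 the threshold is about 0.124, so the reported distance of 0.143 exceeds it, in agreement with the reported significance of 0.011.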
In order to verify the H0 hypothesis, we had to submit the collected data to a statistical test. The result of the test must show whether there is enough statistical evidence not to reject the claim that the average number of errors in BPMN models is not significantly different between the types of process modelers with different technical backgrounds (business analysts or process developers).
The Mann-Whitney U (M-W U) test for two unpaired samples was the non-parametric statistical test applied to verify whether the two samples differ significantly in central tendency. The samples considered were the two distinct sets of BPMN models built by surrogates of business analysts and process developers. We tested whether the Total of Errors values of the two sets of BPMN models differ significantly. By running this hypothesis test, one gathers statistical evidence regarding the quality (correctness) of BPMN models, namely whether it is significantly different considering the technical background of process modelers.
As previously mentioned, the hypothesis H0 aims to verify whether there are significant differences in the number of BPMN syntax errors found in process models built by process modelers with different backgrounds. To reach a conclusion, it was necessary to compare statistics of two unpaired groups of BPMN models and test whether the two groups differ significantly. Our independent variable was Modeler Type. We used it to distinguish between the two groups of BPMN models: one with the models built by business analysts and the other with the models delivered by process developers.
The Mann-Whitney U test is a non-parametric test aimed at assessing whether two samples come from the same population. Table IV summarizes the information concerning the computed ranks for the BPMN models sample. The test starts by ranking all the observations, disregarding the type of modeler they come from, with the values sorted in descending order. The columns of Table IV present, by modeler type (1-Developer, 2-Analyst), the corresponding count of BPMN models delivered (N), the mean of the ranks (Mean Rank) and the sum of the ranks (Sum of Ranks).
|Variable||Modeler Type||N||Mean Rank||Sum of Ranks|
|Total of Errors||1-Developer||18||27.81||500.5|
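The ranking step that produces a table of this shape can be sketched as below. All observations are pooled, tied values receive the average of the ranks they span (midranks), and each group's count, mean rank and rank sum are then reported. The function names and the `toy` data are hypothetical; the study's raw error counts are not reproduced here.

```python
def midranks(values, descending=False):
    """Rank pooled values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def rank_summary(groups, descending=False):
    """Return {group: (N, mean rank, sum of ranks)} over the pooled sample."""
    labels, pooled = [], []
    for label, values in groups.items():
        labels.extend([label] * len(values))
        pooled.extend(values)
    ranks = midranks(pooled, descending)
    summary = {}
    for label in groups:
        group_ranks = [r for r, l in zip(ranks, labels) if l == label]
        total = sum(group_ranks)
        summary[label] = (len(group_ranks), total / len(group_ranks), total)
    return summary


# Hypothetical error counts per model.
toy = {"developer": [2, 5, 5], "analyst": [1, 3, 5, 7]}
summary = rank_summary(toy)
print(summary)
```

The sort direction swaps which group receives the larger rank sum, but it does not change the Mann-Whitney U statistic, since U is taken as the smaller of the two group values.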
Eighteen of the analyzed BPMN models were built by process developers, while the remaining 33 came from business analysts. The Mann-Whitney U test is summarized in Table V, which lists the Mann-Whitney U statistic, the Wilcoxon W statistic, the test’s Z score, and the 2-tailed asymptotic significance (Asymp. Sig. (2-tailed)). The test leads to non-rejection of the null hypothesis at the 5% level of significance (Asymp. Sig. = p-value > 0.05).
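The developer rank sum reported in Table IV (500.5, with N = 18 against 33 analysts) is enough to recompute the test statistic with the textbook normal approximation. The sketch below does so without a correction for ties, which is an assumption of this example; statistical packages usually apply one, so a small difference from the reported significance of 0.519 is to be expected.

```python
import math


def mann_whitney_from_rank_sum(rank_sum_1, n1, n2):
    """Return (U, z, two-tailed p) using the normal approximation without tie correction."""
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-tailed tail area of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Rank sum and group sizes as reported in Table IV.
u, z, p = mann_whitney_from_rank_sum(500.5, 18, 33)
print(u, round(z, 2), round(p, 2))
```

Under these assumptions U is 264.5, z is about -0.64 and the two-tailed p-value is about 0.52, well above 0.05 and consistent with the non-rejection of H0.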
The results from the Mann-Whitney test are confirmed by the Two-Sample Kolmogorov-Smirnov test. This is also a non-parametric test which measures the difference in shapes of the distributions of errors for the two groups of BPMN models. This test also relies on the rank classification presented in Table IV.
Table VI presents the summary of the test’s results, where the most extreme absolute (Absolute) and positive (Positive) differences are both 0.177, while the most extreme negative difference (Negative) is -0.061. A Kolmogorov-Smirnov Z score of 0.603 and a 2-tailed asymptotic significance (Asymp. Sig. (2-tailed) = p-value > 0.05) confirm the results presented for the Mann-Whitney U test.
These results indicate that the BPMN models produced by business analysts are not significantly different, regarding syntax rule violations, from the BPMN models produced by process developers.
|Asymp. Sig. (2-tailed)||0.519|
|a. Grouping Variable: Modeler Type|
|Most Extreme Differences||Absolute||0.177|
|Asymp. Sig. (2-tailed)||0.86|
|a. Grouping Variable: Modeler Type|
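The Kolmogorov-Smirnov figures can be cross-checked from the reported maximum difference alone. The sketch below scales the absolute difference D = 0.177 by the group sizes to obtain the Z score and evaluates the asymptotic Kolmogorov distribution with its standard series; that SPSS uses exactly this asymptotic formula is an assumption of the example.

```python
import math


def ks_two_sample_z(d, n1, n2):
    """Scale the maximum CDF difference by the effective sample size."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


def kolmogorov_asymptotic_p(z, terms=100):
    """Two-tailed asymptotic p-value: 2 * sum((-1)^(k-1) * exp(-2 k^2 z^2))."""
    if z <= 0:
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * z * z)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))


# D and group sizes as reported in Tables IV and VI.
z = ks_two_sample_z(0.177, 18, 33)
p = kolmogorov_asymptotic_p(z)
print(round(z, 3), round(p, 2))
```

The recomputed Z of about 0.604 differs from the reported 0.603 only through the rounding of D, and the p-value rounds to the reported 0.86.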
Empirical Study Results
The final discussion of the experimental study focuses on the interpretation of the results, on the extent to which the study’s results are expected to be representative of the population, and on the identification of lessons learned.
The outcome of the hypothesis test was that the empirical evidence did not allow us to reject the hypothesis that BPMN models built by business analysts and process developers have the same degree of correctness.
This is in line with the claimed suitability of BPMN as a process modeling language for both business analysts and process developers. Indeed, we found business analysts to be as able as process developers to cope with the BPMN constructs and rules. However, since both make errors during process modeling, we also concluded that an automatic BPMN model checker would be beneficial for both kinds of process modelers, giving hints and alerting them to syntactical errors throughout the modeling process.
The evidence collected throughout the experiment suggests that the results attained by participants, who revealed similar BPMN modeling skills regardless of their technical background, can be extrapolated to undergraduate students of the two degrees (Informatics / Technology and Industrial Management) at the academic institution where they are enrolled. Assuming that these students have an academic profile broadly similar to that of students at other polytechnics and universities with equivalent degrees, the results could also be generalized to those students. However, this assumption should be tested by replicating this experiment in such institutions.
One could expect the results to hold for the population as well, if evidence could be collected confirming that the results attained by students with a profile similar to those in our experiment are comparable to those obtained by novice professionals. Nevertheless, this inference should be supported through replications in professional environments. Extrapolating the observed behavior to seasoned professionals requires further studies.
A large amount of time and effort was required for preparing and conducting the experiments with students. After gathering BPMN models we were able to feed them into our pipeline of applications to support the data analysis. Data converters and transformers were used for automating repetitive tasks.
While conducting the experiment, we realized that some steps on the experimental process could be improved. Also, some challenges could have a better approach, in future replications of this experiment, namely the following:
- Narrow the scope of the case study, in order to include in the analysis of results not only the number of errors in models but also the degree of coverage of functional requirements by models;
- Consider a new version of the case study, with a treatment in which participants use a model checker. Modelers should also be able to decide which rules they want to enforce upon BPMN models (e.g. control-flow vs. data flow, or sets of best-practices or standard rules);
- Measure the effects on students’ learning curve of BPMN when using a model checker for the BPMN standard.
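The idea of a model checker with rules selected by the modeler can be sketched as follows. This is only an illustration under stated assumptions: the `Model` and `Flow` classes, the rule functions, and the rule-set names are all hypothetical and do not describe an existing tool. The two "standard" rules encode the BPMN constraints that a start event has no incoming sequence flow and an end event has no outgoing sequence flow; the "best-practice" rule is an example of an optional guideline.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    """A sequence flow between two element ids."""
    source: str
    target: str


@dataclass
class Model:
    """A simplified process model (hypothetical structure)."""
    elements: dict  # element id -> kind, e.g. "startEvent", "task", "endEvent"
    flows: list = field(default_factory=list)


def start_event_has_no_incoming(model):
    targets = {flow.target for flow in model.flows}
    return [
        f"start event '{eid}' has an incoming sequence flow"
        for eid, kind in model.elements.items()
        if kind == "startEvent" and eid in targets
    ]


def end_event_has_no_outgoing(model):
    sources = {flow.source for flow in model.flows}
    return [
        f"end event '{eid}' has an outgoing sequence flow"
        for eid, kind in model.elements.items()
        if kind == "endEvent" and eid in sources
    ]


def task_is_connected(model):
    sources = {flow.source for flow in model.flows}
    targets = {flow.target for flow in model.flows}
    return [
        f"task '{eid}' lacks an incoming or outgoing sequence flow"
        for eid, kind in model.elements.items()
        if kind == "task" and (eid not in sources or eid not in targets)
    ]


# Rule sets the modeler can switch on or off (names are assumptions).
RULE_SETS = {
    "standard": [start_event_has_no_incoming, end_event_has_no_outgoing],
    "best-practice": [task_is_connected],
}


def check(model, enabled_sets):
    """Run only the rule sets chosen by the modeler and collect violations."""
    violations = []
    for name in enabled_sets:
        for rule in RULE_SETS[name]:
            violations.extend(rule(model))
    return violations


model = Model(
    elements={"s": "startEvent", "t": "task", "e": "endEvent"},
    flows=[Flow("s", "t"), Flow("t", "e"), Flow("e", "s")],
)
for message in check(model, ["standard", "best-practice"]):
    print(message)
```

Keeping each rule as a separate function that returns a list of messages makes it straightforward to count violations per model, which is the kind of measure analyzed in this experiment.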
Throughout the experimental process, its practical details were recorded. The feedback provided by students while learning BPMN and applying it to the case study helped in writing this paper. This information was particularly valuable for packaging the experiment, as well as for its future replication.
We found no empirical studies addressing BPMN characteristics for different user profiles. However, we did find some empirical studies analyzing BPMN characteristics, as well as studies comparing BPMN with other modeling languages. Those studies are presented next, highlighting the main differences between them and the work presented here.
One study presents a set of measures to evaluate the structural complexity of business process models at a conceptual level. However, unlike the work presented here, it offers only an experimental plan, and not actual results, for a family of experiments to be applied to a population composed of experts in business analysis and software engineering, with the intention of validating the proposed metrics as well as evaluating quality aspects of business process models at a conceptual level.
The previous study was later extended with an empirical validation of the measures, carried out along with a linear regression analysis aimed at estimating process model quality in terms of modifiability and understandability. That study was applied to a homogeneous sample of participants (students of computer science and information systems), which differs from the heterogeneous groups of students used in our study as surrogates of process developers and business analysts. As a result of a correlation analysis and a multiple linear regression analysis of the collected data, a reduced group of measures was identified as useful for predicting several aspects of the understandability and modifiability of business process models expressed in BPMN. Of the measures analyzed, after carrying out a correlation and a principal components analysis of the variables, the authors conclude that 12 are useful for predicting aspects of the understandability of a business process model. With regard to modifiability, other measures were identified as good predictors. By crossing the data obtained, they also conclude that some measures can be considered good predictors for both dependent variables. According to the authors, the regression models obtained represent a guideline for defining understandable and modifiable processes, or for predicting such characteristics in existing ones. They also seem useful for guiding process improvement initiatives.
Birkmeier and Overhage [16] present the results of an empirical study that examines the use of BPMN and the UML Activity Diagram (UML-AD) by business users during a model creation task. Unlike the present one, that study compares BPMN with another modeling language, and its results indicate that UML-AD is at least as usable as BPMN, since neither user effectiveness, efficiency, nor satisfaction differ significantly.
Another study compares BPMN against Event-Driven Process Chains (EPC). The study measured the comprehension and problem-solving capacities of students as surrogates for business users. No significant differences were identified, and the study acknowledged a different focus: it examined teaching effects and therefore compared the performance of trained participants in the EPC group with that of untrained participants in the BPMN group.
This work intended to validate, through an empirical study, whether BPMN is a process modeling language suitable for process modelers with different technical skills. This was done by assessing whether models produced by people with different technical profiles have the same degree of compliance with BPMN syntax rules.
As research method we adopted the scientific method, based on previous studies in Experimental Software Engineering. Throughout the experimental study, the research problem was aligned with the research objective, and the context of the experiments was defined. Next, a set of activities was carried out to specify how the experimental study had to be performed. This was done through more detailed decisions concerning the context of the experimental study, namely the formulation of the hypothesis under study, the elicitation of the set of independent and dependent variables used in the statistical test, the selection of subjects to participate in the experiment, the experiment’s design and instrumentation, as well as a preliminary evaluation of the experiment’s validity.
The next set of activities consisted of instantiating the previously established plan, constrained to the specific circumstances found in the actual experiment. The activities of describing the data set, reducing it, and testing the hypotheses defined in the experiment plan were also carried out. Finally, the results were packaged and their possible generalization to the real world was discussed. This involved documenting the whole experimental process and discussing the achieved results from particular perspectives, such as the interpretation of the results, their inference to the population, and the identification of lessons learned.
Throughout the paper we have instantiated the mentioned activities within an experimental study, using a sample of BPMN models. The empirical evidence did not allow us to reject the hypothesis that BPMN models built by business analysts and process developers have the same quality.
Future work will focus on verifying the results obtained with surrogates, by replacing them with actual professionals working as business analysts and process developers in industry.
This work was supported by Portuguese funds through the Center of Naval Research (CINAV), Portuguese Naval Academy, Portugal.
1. OMG, “Business Process Model and Notation (BPMN),” dtc/2010-05-04, Object Management Group, 2011.
6. A. Jedlitschka and D. Pfahl, “Reporting Guidelines for Controlled Experiments in Software Engineering,” Proc. 4th International Symposium on Empirical Software Engineering (ISESE 2005), IEEE Computer Society, 2005, pp. 95-104.
9. AcademyHealth, “Research Methods and Techniques,” 2017; http://www.hsrmethods.org/glossary.aspx.
13. M. Gogolla, et al., “Validation of UML and OCL Models by Automatic Snapshot Generation,” Proc. 6th International Conference on the Unified Modeling Language (UML’2003), Springer, Berlin, LNCS 2863, 2003.
16. D. Birkmeier and S. Overhage, “Is BPMN Really First Choice in Joint Architecture Development? An Empirical Study on the Usability of BPMN and UML Activity Diagrams for Business Users,” Research into Practice – Reality and Gaps: 6th International Conference on the Quality of Software Architectures, QoSA 2010, Prague, Czech Republic, June 23 – 25, 2010. Proceedings, G. T. Heineman, et al., eds., Springer Berlin Heidelberg, 2010, pp. 119-134.