APP下载

Multi-dimensional Security Range Query for Industrial IoT

2022-08-24AbdallahAbdallahAymanAlyBassemFelembanImranKhanandKiIlKim

Computers Materials&Continua 2022年7期

Abdallah Abdallah, Ayman A.Aly, Bassem F.Felemban, Imran Khanand Ki-Il Kim

1School of Engineering Technology, Al Hussein Technical University (HTU), Amman, 11831, Jordan

2Department of Mechanical Engineering, College of Engineering, Taif University, Taif, 21944, Saudi Arabia

3Department of Electrical Engineering, University of Engineering and Technology Peshawar, Pakistan

4Department of Computer Science and Engineering, Chungnam National University, Daejeon, 34134, Korea

Abstract: The Internet of Things (IoT) has allowed for significant advancements in applications not only in the home, business, and environment, but also in factory automation.Industrial Internet of Things (IIoT) brings all of the benefits of the IoT to industrial contexts, allowing for a wide range of applications ranging from remote sensing and actuation to decentralization and autonomy.The expansion of the IoT has been set by serious security threats and obstacles, and one of the most pressing security concerns is the secure exchange of IoT data and fine-grained access control.A privacy preserving multi-dimensional secure query technique for fog-enhanced IIoT was proposed in light of the fact that most existing range query schemes for fog-enhanced IoT cannot provide both multi-dimensional query and privacy protection.The query matrix was then decomposed using auxiliary vectors,and the auxiliary vector was then processed using BGN homomorphic encryption to create a query trapdoor.Finally, the query trapdoor may be matched to its sensor data using the homomorphic computation used by an IoT device terminal.With the application of particular auxiliary vectors,the spatial complexity might be efficiently decreased.The homomorphic encryption property might ensure the security of sensor data and safeguard the privacy of the user’s inquiry mode.The results of the experiments reveal that the computing and communication expenses are modest.

Keywords: Internet of things; data security; ciphertext; privacy encryption

1 Introduction

The Internet of Things (IoT) is regarded as the third revolution in the development of the information technology industry after computers and the Internet.Relying on the rapid development of smart homes, Internet of Vehicles (IoV), smart medical care, and smart grids based on the IoT, the era of the Internet of Everything (IoE) has arrived [1-5].The latest statistics from the Statista website show that it is expected to exceed 30 billion interconnected devices in 2020 [6-10].

Countless smart terminals in the IoT transmit the massive amounts of data they sense through complex information and communication channels to data centers with huge data storage and processing capabilities, which can be abstracted as a“terminal transmission pipeline cloud”architecture, which also corresponds to the IoT three logical levels: perception layer, transmission layer and application layer [11-14].Because of its ubiquitous network, comprehensive perception, reliable transmission, and intelligent processing, the characteristic elements are becoming more prominent,causing the IoT to face very severe security challenges [15].

The IIoT is the integration of the IoT system and the industrial automation system.It has the characteristics of comprehensive perception, interconnected transmission, and intelligent processing.It is one of the main development directions of the IoT technology in the future [16].One of the important tasks of the IIoT application system is to analyze and process the sensor data of the IoT devices.If all the sensor data of the IoT devices are collected in the control center for processing, not only may the system be delayed due to network congestion, but also it will expose the system to the risk of a single point of failure [17-20].The edge computing model that has emerged in recent years can effectively alleviate the performance bottlenecks and security risks that exist in the traditional centralized computing model by making full use of the computing power of the network edge devices[21-25].By deploying fog devices at the edge of the IIoT, the sensor data in the IoT can be preprocessed on the edge of the network to improve the quality of service (QoS) [26-30].

The protection of sensor data of IoT devices is one of the important issues that need to be considered in fog-enhanced IIoT systems [31].For example, in a fog-enhanced industrial Internet system, operation and maintenance personnel need to monitor whether the IoT devices in the system are operating normally, that is, query whether the sensor data of the IoT devices is in a reasonable range to determine their operating status.For security considerations, sensor data and query results cannot be leaked to any other unauthorized parties during the query process, so a secure data query plan must be designed [32].In terms of data security query, many scholars have considered outsourcing the query of encrypted data.For example, the authors in [33] used Bloom filters to build a query index, which realized the fuzzy multi-keyword sort range query of encrypted data.The authors in[34] proposed a range query method that integrates bucket partitioning, identity authentication and verification code technologies for the data query problem in a two-layer wireless sensor network(WSN).The authors in [35] studied the multi-dimensional confidential range query of outsourcing data, and proposed an individualized query scheme with flexible permissions.In recent years, the range query for non-outsourced encrypted data has attracted the attention of researchers.For example,Reference [36] proposed a range query matrix method, and designed a single-dimensional security range query scheme using homomorphic encryption technology.Existing research on range query that supports privacy protection features mainly focuses on the query range and the confidentiality of the query subset that meets the conditions.Multi-dimensional query is not supported, and the communication overhead is relatively high.

Because some large-scale IoT devices in the IIoT have many different types of sensors, the monitoring of their status requires multiple sensor data belonging to the device, that is, multidimensional data query, while the traditional data query can only use multiple queries will increase the system’s computing and communication costs.Therefore, it is necessary to design a multi-dimensional data query solution for the IIoT.In addition, the confidentiality of sensor data and the protection of user query mode need to be considered when designing the scheme.Constructing multi-dimensional data-oriented query trapdoors and matching based on homomorphic encryption is a feasible idea to solve this problem, such as the Boneh-Goh-Nissim (BGM) encryption algorithm [37] and the algorithm proposed by Reference [38].The user performs confidential processing of the query range to form a query trapdoor.After receiving the query trapdoor in the form of ciphertext, the IoT device matches its own sensor data, and uses the self-blindness of the homomorphic encryption algorithm to form a new sensor data and report the ciphertext to the fog device.The fog device collects the ciphertext of the sensor data that meets the conditions submitted by the IoT device to which it belongs,and returns the collected result to the inquiring user for calculation and decryption.

This paper proposes a multi-dimensional security range query solution with privacy protection features for the fog-enhanced IIoT.The solution uses query interval matrix generation, decomposition and reconstruction based on auxiliary vectors, and other methods to reduce the storage and communication costs.Then, uses the homomorphic encryption to achieve privacy protection.The proposed algorithm first maps the range ofn-dimensional data involving multiple different dimensions and different starting points to a query matrix.The size of the matrix is determined by the total query interval size and is not affected by the required query dimension.Second, construct multiple auxiliary vectors to decompose and reconstruct the matrix.Third, we used the BGN homomorphic encryption to design the query trapdoor generation and matching.Finally, the feasibility and performance of the proposed and existing schemes are verified and analyzed through simulation experiments.

2 Pre-Knowledge

This section introduces the two basic mathematical methods used in the proposed algorithm,namely the large composite numberN=pq-order bilinear mappinge:G×G→GTand the BGN homomorphic encryption algorithm.

2.1 Large Composite Order Bilinear Mapping

Letp,qbe two large prime numbers of the same length, that is, their bit length |p|=|q|.LetN=pq, when there is a computable mapping relationshipethat satisfies the following three properties between the groupGande:G×G→GT, the mappinge(G,GT) can be called composite order bilinear mapping.

1) Bilinear.For any (g,h)∈G2,a,b∈ZNande(ga,hb) =e(g,h)ab.

2) Non-degenerate.There exists g∈G, so thate(g,g) is on the groupGTof orderN.

3) Computability.There is an efficient algorithm,when(g,h)∈G,alle(g,h)∈GTare computable.

The large composite order bilinear parameter generatorCGen(K) is a probabilistic algorithm,which takes the safety parameterKas an input value and outputs a five-tuple (N, g,G,GT,e).Among them,N=pq,pandqare prime numbers of 2Kbits.GandGTare two groups of orderN.g∈Gis a generator of orderNof groupG.e:G×G→GTis a non-degenerate, bilinear mapping that can be calculated efficiently.

Let g be a generator ofG, thenh∈gq∈Gcan generate ap-order subgroupGp= {g0,g1,...,gp-1},and g′= g∈Gcan generate aq-order subgroupGq=Gp= {g′0,g′1,...,g′q-1}.At this time, the problem of sub-group decision (SGD) on the groupG[39] can be expressed as: given a five-tuple (N,g,G,GT,e), if the elementxis randomly from the groupGor its subgroupGqis selected, it is difficult to determine whetherxis an element inGq.If this problem is established, the security of the BGN homomorphic encryption algorithm is guaranteed [40-46].

2.2 BGN Homomorphic Encryption Algorithm

The BGN homomorphic encryption algorithm is a well-known fully homomorphic encryption algorithm, which consists of three phases, namely the key generation, encryption and decryption.

1) Key Generation Phase

Given the safety parameterK, the composite order bilinear mapping parameter group (N, g,G,GT,e) is generated by the generatorCGen(K).HereN=pq, wherep,qare prime numbers of twoKbits,GandGTare two groups of orderN, and g∈Gis a generator of orderNof groupG.Leth= gq,his a random generator of orderpofG.At this time, the public key pk = (N,G,GT, e, g,h), and the secret key sk =p.

2) Encryption Phase

Suppose the message space which contains only integers and whose capacity is determined by the specific application={0,1,...,Δ},Δ≪q.When encryptingm∈,select a random numberr∈ZN,then the ciphertextc=E(m,r)=gmhr∈G.

3) Decryption Stage

Given the ciphertextc=E(m,r)=gmhr∈G, the plaintext can be recovered with sk.Observation shows thatcp= (gmhr)p= (gp)m, if you want to decryptm, it is equivalent to solving the discrete logarithm problem of cpwith gpas the base, and since 0≤m≤Δ, the time to solve this problem using the Pollard lambda algorithm and the complexity isO(Δ).

In addition, the BGN homomorphic encryption algorithm has self-blindness, that is, given the ciphertextE(m,r)∈G,E(m,r+r′) =E(m,r)hr′∈Gis valid ciphertext ofm.

The BGN homomorphic encryption algorithm has the following homomorphic characteristics.

a) Additive homomorphism on groupG.GivenE(m1,r1)∈GandE(m2,r2)∈G, we have

For the sake of simplicity, the random number termcan be ignored and rewritten asE(m1)E(m2) =E(m1+m2)∈G.

b) Multiplicative homomorphism on groupG.GivenE(m1,r1)∈G, andm2∈, there areE(m1,r1)m2=E(m1m2,r1m2)∈G, that is,E(m1)m2=E(m1m2)∈G.

c) Multiplicative homomorphism from groupGto groupGT.GivenE(m1)∈GandE(m2)∈G,e(E(m1),E(m2)) =ET(m1m2)∈GT.

d) Additive homomorphism on the groupGT.GivenET(m1)∈GTandET(m2)∈GT,ET(m1)ET(m2) =ET(m1+m2)∈GT.

e) Multiplicative homomorphism on the groupGT.GivenET(m1)∈GT,m2∈ˆ,E(m1)m2=E(m1m2)∈GT.

3 Multi-Dimensional Query Scheme with Privacy Protection Features

The proposed algorithm is an aggregation query solution for various fog-enhanced IIoT.The multi-dimensionality in this article means that the sensor data of an IoT device is composed of data of multiple different dimensions.The data includes water temperature, voltage, volume, material allowance, etc.According to actual needs, it may be necessary to query the sensor data of different dimensions of the device at the same time.For example, in a factory, the average value of water temperature, voltage, volume, and material remaining can be counted, which cannot only understand and predict the operating status of the equipment in time, but also provide a basis for optimizing the production process.

It should be noted that in the actual IIoT environment, data have different dimensions and accuracy.For example, power consumption data, water consumption data, etc.may have decimals.Since each IoT device is unlikely to change its detection accuracy through software means such as over the air (OTA) upgrades after manufacturing, in order to simplify the calculations and adapt to the needs of multi-dimensional queries, the proposed algorithm is as follows.The sensing data with an accuracy of less than 1 needs to be converted when participating in the calculation.For example, the value of the power sensor data at a certain time node is 10.37 W.In the calculation, multiply it by 100,which becomes103710.37W.In the calculation, multiply it by 100, which becomes 1037 and then perform range query.In this way, the range query of multiple dimensions can be realized without loss of accuracy.In fact, the amplification factor of a certain dimension of data can be written into the device when it is deployed.

3.1 System Model

There are three types of entities in the proposed scheme, namely, fog equipment at the edge of the network, IIoT equipmentD= {D1,D2, ...,DN} located within the jurisdiction of fog equipment, and query users.The proposed model is shown in Fig.1.

Figure 1: Proposed system model

1) Industrial Internet of Things equipmentD={D1,D2, ...,DN} is a collection of a group of devices, which are distributed in each specific domain.Each IoT device not only has the ability to detect and collect specific data, but also has the ability to transmit data, so that each IoT deviceDkcan periodically report sensor data to fog devices in its domain.

2) The fog equipment in the model of this paper is located at the edge of the network, and each fog equipment has its own jurisdiction.This jurisdiction can be determined according to specific application scenarios, such as a production equipment, a workshop, or even a factory area.The fog device can collect and calculate the sensor data of the device within its jurisdiction,and complete the query request sent by the user.Compared with IoT devices, fog devices have stronger computing and storage performance, and can complete sensing data collection and response to IoT devices in real time.

3) In the model of the solution in this paper, the query user can directly generate multiple-dimensional range queries and send them to the fog device to obtain the required data from the fog device.For example, the user may want to know how many device sensor data within the range of fog devices can satisfy dimension 1, range [B1,T1], dimension 2, range [B2,T2],..., dimensionm, range [Bm,Tm].The sensor data of which devices meet the conditions.What is the average value of sensor data in each dimension of the device that meets the conditions?The fog device can collect the sensory data that meets the conditions fed back from all devices within the jurisdiction, and then generate the correlation degree based on the feedback data,and finally return the correlation degree and sensor data to the user.

3.2 Multi-Interval Query Matrix

The proposed algorithm is improved on the basis of the single query interval matrix algorithm,so that the multi-dimensional range query can be matrixed while maintaining low communication overhead.Through observation, it can be found that any given query interval can be decomposed into two types of rows, namely partial rows (PR) and continuous rows (CR).Fig.2 shows the query matrix after the mapping of two query intervals, the dark gray part is PR, and the light gray part is CR.This matrix structure provides the possibility to further compress the matrix.In general, a typical query interval can be decomposed to get two PR and one CR.

Figure 2: The query matrix after the mapping of two query intervals

Suppose the total number of query intervals isn.First, define four special auxiliary vectors.

1) Xk=(xk1,xk2,...,xkn).The generation rule is that when thei-th row elements in thek-th matrix are all 1, setxki=0; otherwise,xki=1.

This article deals with incomplete rows and continuous rows in the matrix respectively.First,split the incomplete rows corresponding tonquery intervals into multiple matrices R1, R2,..., R2nindependently, and split all consecutive rows into a matrix RC, as shown in Fig.3.Obviously, the original query matrix R=R1∪R2∪...∪R2n∪RC.

Figure 3: Query the split result of the matrix

At this time, all elements in the matrix can be obtained through vector operations.Use Rkto represent the matrix related to incomplete rows, and use RCto represent the matrix related to continuous rows.The AND operation of the matrix is expressed as a bitwise AND operation, and the matrix OR operation is expressed as a bitwise OR operation and sorted into a vector multiplication.The form of sum vector addition is shown in Eq.(3).

Among the 8n+4vectors involved in the operation,there are4n+1vectors whose elements are all 1,and donot need to be calculated and stored,namely{X1, X2,...,X2n,}.Further,the elements in YCare all 0, and no storage and calculation are required.Therefore, only need to calculate and store {Y1, Y2,...,Y2n,,XC} a total of 4n+2 vectors to complete the calculation and matrix reconstruction.For any elementR(i,j) in the matrix, since= 1,xki=1 (k= 1,2,...,2n), andyCj= 0,= 1, its value can be calculated by Eq.(4).

After calculation, it can be seen that the vector XChas no effect in the reconstruction, and the final number of vectors required is 4n+1.

3.3 Data Query Process

The program flow of the proposed method is mainly composed of three parts.First, the key management server generates the query key, distributes the private key to the user, and the public key to the fog device FDi.When the user performs a data range query, the private key is used to encrypt the query range and query dimension to generate a trapdoor, and send it to the fog device FDi.Then,after receiving the query trapdoor and query dimension ciphertext from the user, the fog device sends the query trapdoor to the IoT device under its jurisdiction for query.After the IoT device calculates the result ω of this query, it sends it to the fog device for aggregation.The fog device obtains ζ through aggregation and calculation and feeds it back to the user.Finally, the user calculates and decrypts the received ζ, and the query result can be obtained.The whole process is shown in Fig.4.The following will specifically introduce the process of this article.

Figure 4: System flow

To facilitate the description of the data query process, the main parameters used in the process structure are listed in Tab.1.

Table 1: Main parameters and their definitions

1) Key Generation

Given the security parameterK, the key management server generates a composite order bilinear mapping parameter set (N, g,G,GT,e) by the generatorCGen(K).WhereN=pq,p,qare twoKbit random prime numbers,G,GTare twoN-th order groups, g∈Gis anN-th order generator of groupG,e:G×G→GTis large Composite order bilinear mapping.Seth= gq, thenhis a random generator of orderpofG.The key management server generates public key pk = (N,G,GT,e, g,h), private key sk =p.After that, the key management server distributes the private key sk to the inquiring user, and distributes the public key pk to all fog devices FDi, and the fog device delivers to the IoT devices in its jurisdiction.The user sets its own message space asˆS={0, 1,...,Δ},Δ≪q.

2) Query Trapdoor Generation

For each certain use scenario, the number and type of dimensions involved in each query initiated by the user are determined.In the proposed scheme, the number of dimensions of the data to be queried is set ton, that is, the number of intervals that users query isn.It should be noted that the query intervals [Bquery,Tquery] referred to in the process are all multiplied sensing data intervals.The IoT has also performed the same multiplication processing on the data, so the data multiplication process will not be repeated.The trapdoor generation of a range query consists of the following steps.

a) Data Mapping and Conversion.Set any query dimension of this query as the first interval,and then determine the starting point of each interval in the query sequence according to the following rules.Among them,liis the interval length of thei-th interval.

That is, the starting point of anyi-th interval is the position after the end point of the (i-1)-th interval is increased by one interval length to prevent multiple intervals from being connected and forming too many CR blocks.After determining the starting point, the ending point of anyi-th interval is

For a query interval [Bqueryi,Tqueryi], thek-th elementuikin the query interval satisfies

The data offset βiof the query interval can be determined by the starting point of the query interval.Aftertheoffset,thek-thelementinthequeryintervalisvik,thenβiandviksatisfythecondition shown in Eq.(8).

Determine the matrix ordermfrom the end point of the last query interval, and then construct them-order query matrix R.In order to map the query interval to the query matrix R, mapvikto the matrix elementR(i,j), as shown in Eq.(9).

The ordermof the matrix is determined by the end point of the last query interval rather than the upper bound of a single query interval, so the storage space requirement of each vector has nothing to do with the upper bound of a single interval of multi-dimensional query, only the length of all query intervals and related, the storage overhead isO(m).

b) Vector Generation and Encryption.After generating the query matrix R, according to the method described in Section 3.2, thecorrespondingvectors X,Y, X′, Y′aregenerated fromthe query matrix, and the required 4n+1 vectors are selected for storage and encryption.which is

For the convenience of calculation, the vector Ykis used to further construct the vectork, as shown in Eq.(11).

According to Eq.(4), all elementsR(i,j) in the query matrix can be represented by the elements in the vector

The user selects 2n+1 random numbersr1,r2,...,r2n+1to encrypt all 4n+1 vectors to obtain

where

In order to meet the needs of multi-dimensional query, two values need to be added to the originalE(i), namely the queried dimension γiand the offset βirequired by the vector during operation.

Calculate the hash value Hiof the processed vector.

At this point, the query trapdoor α is generated.Send it to the fog device FDi, and then send it to each IoT deviceDkin its domain for query.In addition, the user also needs to generate and send the dimensional ciphertextET(n) required for this query to the fog device FDi.

3) Processing on the Device Side of the IoT

Corresponding to the dimension identifier γ sent by the user terminal, each IoT deviceDkhas its own dimension identifier.The IoT device sequentially extractsE′() in the query trapdoor α sent by the user and its corresponding hash valueHifor calculation and comparison.First, the IoT device calculates and compares theHisent by the user.If they are consistent,the dimension identifieriin this vector is taken fromE′(i) and compared with the dimension identifierof the device itself.After comparison, the IoT devices screen out queries that meet their own dimensions, and extract related vectors from the query trapdoor together and form a queryVector for the next step of calculation.The specific process is shown in Algorithm 1.The time complexity of Algorithm 1 isO(n) and is related to the numbernofE() vectors sent by the user to query the trapdoor.

Algorithm 1: Equipment Comparison Input:γ′k,α Output: queryVector 1: function DeviceCompare(E′(¯Y),H)2: I←0 3: queryVector←0 4: while i = 1 to n 5: Do H′I←Hash(E(¯Yi)||E(X′i)||E(X′C))6: if H′I= Hi 7: if γ′k= γi 8: queryVector←Build(E′(¯Yi), E(X′i),E(X′i+1),E(Y′i+1),E′(X′C))9: break 10: end if 11: end if 12: i + +13: end while 14: return queryVector 15: end function

After obtaining the queryVector, the IoT device first extracts the offset βkof this query from it to offset its own sensor data, obtain the offset value, and use it.The ElementShift function gets its position (i,j) in the matrix.At this time, the IoT device extracts the corresponding vector from the queryVector according to (i,j) for calculation.The specific process is shown in Algorithm 2.Each IoT device only needs to execute once, and the time complexity isO(1).

Algorithm 2: IoT device handling Input:γ′k, mk, queryVector Output:ωk 1: function DataShift(queryVector,mk)2:βk←Parse(queryVector)3: v′k←v*k-βk 4: end function 5: function ElementShift(v′k)6: while i = 1 to m do 7: while j = 1 to m do 8: if (i - 1)m + j = v′k 9: break 10: end if 11: end while 12: end while 13: return i, j 14: end function 15: function DeviceCompute(v′k, queryVector)16: i, j←ElementShift(v′k)(Continued)

Algorithm 2: Continued 17: E(¯Y1j), E(¯Y2j), E(x′1i), E(x′2i), E(x′Ci)←Parse(queryVector)18: rk1, rk2←Random()19: ck←e(E(¯y1j),E(x′1i))e(E(¯y2j),E(x′2i)),e(E(¯yCj),E(x′Ci))e(g,h)rk1 20: sk←cv*k ke(g,h)rk2 21:ωk←Build(ck,sk,γ′k)22: return ωk 23: end function

For theckandskdescribed in Algorithm 2, there are

By executing Algorithms 1 and 2, the sensor dataof the IoT device can be converted into the mappingckof the corresponding position of the query matrix in theGTgroup, so that a query is completed.In addition, the actual sensor data of the IoT device can also be returned throughskin the case of query matching.After completing the calculation, send ωkto the fog device FDiof the domain to which the IoT device belongs.

4) Treatment of Fog Equipment

After the fog device FDireceives the ωkfrom theksubordinate IoT devices, it extracts the resultckof then-dimensional query by the IoT device from it and calculates it.The fog device multiplies all dimensional datack.According to the homomorphism of the BGN algorithm, the result obtained is the sum of all the results fed back on thekdimensions.The matching degree σiof this query of FDiis sent to the user for this sum value.The difference of the encrypted query dimension informationET(n).The FDiconstructs σiand all the values of ωkinto ξiand sends it to the user.The specific process is shown in Algorithm 3.The time complexityO(n) of Algorithm 3 is related to the numberkof IoT devices to which the fog device FDibelongs.

Algorithm 3: Fog Equipment Handling Input:ω1,ω2,...,ωk,ET(n)Output:ξi 1: function FDProcess(ωk,ET(n))2: while k = 1 to n do 3: ck←Parse(ωk)4:σi←σick 5: end while 6:σi←ET(n)/σi 7:ξi←Build(σi,ω1,ω2,...,ωk)8: return ξi 9: end function

5) User Analysis Data

After the user receives the data ξisent by FDi, first extract the query matching degree value σiand decrypt it.If and only if σi= 0,the result returned by the fog device FDicompletely matches the query.The user multiplies the completely matched data returned by the fog device by the sub-dimensions, and calculates the number of completely matched devices.The specific process is shown in Algorithm 4.The time complexity is related to the numbernof fog equipment FDi, which isO(n).

Algorithm 4: User data analysis Input:ξ1,ξ2,...,ξi Output: S1, S2,...,Sγ, VD 1: function UserProcess(ξ1,ξ2,...,ξi)2: while i = 1 to n do 3:σi←Parse(ξi)4: S1, S2,...,Sγ←0 5: VD←0 6: if Decrypt (σi= 0)7:ω1,ω2,...,ωk←Parse(ξi)8: while 1 to k do 9: sk,γ′k←Parse(ωk)10: Sγ′k←Sγ′ksk 11: end while 12: VD++13: else 14:ξi←NULL 15: end if 16: end while 17: return S1, S2,...,Sγ, VD 18: end function

At this time, VDis the number of valid data returned.After decryptingSγ,the sum of the data that meets the query conditions under the dimension γ can be obtained.After processing, the unmatched ξiare all set to 0, and the remaining elements come from the fog device that has returned valid data,and the specific fog device that has returned valid data can be located.

4 Safety Analysis

This section analyzes the security of the solution in this article, including forward security,backward security, privacy protection, unlinkability and other key security features, and introduces some attack modes designed for encryption range query solutions in recent years.The security of the proposed scheme is analyzed in the face of these attack modes.

4.1 Forward and Backward Safety

Assuming that the adversaryAobtains the private key sk =pused in a certain query, it can only decrypt the ciphertext of the current query.Since each time a query is made, the user will use the newly generated public and private key to query, the adversaryAcan only know the security parameterKused by the user to generate the key using the private key sk =p, and cannot be calculated from this.Any information of the key previously generated by the user cannot decrypt any previously queried data except the current queried data.Therefore, the proposed scheme satisfies forward and backward security.

4.2 Privacy Protection in Query Mode

Suppose that adversaryAintercepts the trapdoor used by userUin a certain query α=, because it can only be obtained in public key pk=(N,G,GT,e, g,h) and the encrypted informationET(n) of the total dimension of the current query, so it cannot decrypt the specific content of the query trapdoor, and cannot know the current total number of query dimensions for each query.Furthermore, due to the semantic security of the BGN algorithm, the attacker cannot distinguish the content of the encrypted vector in the trapdoor, nor can it analyze the user’s query rules by intercepting the trapdoor multiple times.

In addition to the adversaryA, the IoT deviceDkand the fog device FDialso cannot know the specific information of the query trapdoor.Because the proposed scheme is different from the traditional query solution, as it uses homomorphic encryption calculation to replace the traditional comparison process.Dkdetermines its original sensor dataaccording to the query offset β and extracts the corresponding encryption in the query trapdoor for calculation.In this process,Dkcannot obtain the specific value of the query interval[Bk,Tk].In the solution,FDionly aggregates the feedback ωksent by the devices running in the IoT domain where it is located, and cannot know the specific query intervals in each dimension.Therefore, the query trapdoor in this paper will not reveal any useful information, such as query mode, query interval, etc., and the privacy of the query trapdoor is protected.

4.3 Sensing Data Confidentiality

Assume that adversaryAintercepts the message ωksent by the IoT deviceDkto the fog device FDiin a certain query.Since it cannot obtain information other than the public key and the total number of dimensions in the current query in the public information, it cannot decrypt ωk.It is impossible to know whetherDksatisfies this query or obtain the original sensor dataofDk.Since the IoT device uses different random numbers when sending information to the fog device, the adversaryAcannot analyze any useful information by intercepting ωkmultiple times.

Suppose that the adversaryAintercepts the message ξisent by the fog device FDito the user.Since it cannot obtain the private key sk =p, it cannot decrypt ξi, and cannot obtain the IoT devices that meet the query conditions and their sensor data.Same as the foregoing, the introduction of random numbers makes it impossible for adversaries to analyze and learn useful information such as query rules by intercepting ξimultiple times.

In addition to the adversaryA, the fog device FDicannot learn the raw sensor data of the IoT deviceDkwhether the query is satisfied.After FDireceivesDk’sfeedback ωk, since it can only master the public key pk, it cannot decrypt the specific content of ωk, and cannot know whetherDkmeets this query or obtains the original sensor dataofDk.Therefore, no party other than the inquiring user and the IoT device can obtain the sensor dataof the IoT device, and the confidentiality of the sensor data is guaranteed.

4.4 Unlinkability

In the proposed method, due to the use of random numbers, all query trapdoors generated by userUare random.Even if two or more query trapdoors are sent by the same user, external attackers cannot determine whether these queries come from the same user, which means that the attacker cannot associate two different queries to a specific user.Therefore, the proposed solution provides unlinkability, and the attacker cannot associate the identity of the queryer by intercepting the query trapdoor.

4.5 Query Trapdoor Integrity

In order to prevent data integrity damage that may occur during data transmission, in the proposed solution, when userUgenerates a query trapdoor, paired query vectorsE(),E() andconnect and calculate the message fingerprint Hash, and the generated hash value is sent to each IoT deviceDk.Before extracting the content of a certain set of encrypted vectors,Dkfirst uses the hash value to verify the message.If the verification fails, the set of vectors is regarded as a damaged vector and discarded.In this way, the data integrity of the query trapdoor is guaranteed.

4.6 Key Update

In the proposed solution, the query process for the sensor data of IoT devices is essentially a BGN-based homomorphic encryption calculation process.The process does not involve decryption operations.The IoT devices and fog devices only need to know the system public key pk and the private key sk is only in the hands of the inquiring user.Therefore, there are fewer entities involved in key management and update, and the cost of key management is lower.

4.7 Protection Against Other Attack Methods

In addition to the above-mentioned common types of security attacks, in recent years, there have been a variety of attacks against encryption range queries.Among them, Reference [40] attacked encrypted data query by means of access mode disclosure.Reference [41] attacked the deterministic encryption (DTE) and order preserving encryption (OPE) methods in the attribute scheme, and used encrypted information and public auxiliary data to complete the plaintext recovery.Reference [42]defined two basic sources of leaks, such as access mode leaks and communication capacity leaks,and developed a universal one that is applicable to any system that supports range query and whose access mode or communication capacity is leaked.Reference [43] proposed an improved replay attack on encrypted data using the data leaked in the range query, including query mode and sorting information.

Different from the database query scenarios discussed in [40-43], the proposed scheme mainly considers the range query with privacy protection features in the fog-enhanced IIoT, which is an aggregate privacy protection query.The proposed scheme can protect the privacy of the upper and lower bounds in the range query and the subset of devices that meets the query conditions, and will not reveal the related information such as query mode and keyword frequency.In addition, for any range query, the data volume of the query results returned by the proposed scheme is the same, and no flow information will be leaked.Therefore, the proposed scheme can resist the above-mentioned several typical attacks on the security range query.

5 Performance Analysis

This section analyzes and compares the performance of the proposed scheme with the existing similar schemes, mainly including the comparison of the calculation and communication overheads of the query trapdoor generation phase and the server query phase.All simulation experiments were carried out on a notebook computer configured with Intel i7-9750H 2.60 GHz, 16 GB memory, and running Windows10 system,using Java Math’s large number calculation library and JPBC library[44]to implement the algorithm.In addition, the operations that consume less time for addition operations,hash operations, etc., are ignored in the comparison.

5.1 Computational Overhead

5.1.1 Computational Overhead at Each Stage

For convenience, defineTm,To,Tp, andTemto represent the time cost of single integer multiplication,integer modular power,bilinear mapping,and point multiplication on elliptic curves,respectively.Letlbe the length of the query interval in each dimension,kis the total number of dimensions, andqis the upper bound of a single query interval.For solutions that only support single-dimensional queries,such as the Reference [36] solution, a multi-dimensional query is treated as multiple single-dimensional queries.Tab.2 shows the comparison of the computational cost of each stage.

Table 2: Comparing the computational cost of each stage

Both the proposed scheme and Reference [36] method involve the IoT devices directly at the edge of the network to participate in the query, so the sensor data does not need to be encrypted and stored on the server, and the encryption phase does not generate computational overhead.

5.1.2 Full-Dimension Query Calculation Cost

This section considers the computational cost of querying data of all dimensions for each query when the total number of dimensions is certain.Suppose the total number of dimensionsk=10,the length of the query intervall=36, and the upper bound of the query intervalq=50.Then the References [36,44-46] algorithms and the proposed algorithm takes about 22.641, 15.864, 20.191,21.638 and 20.360 ms to complete a full-dimensional query.

Fig.5 shows the comparison result of the calculation cost of the full-dimension query of the specified dimensions for the scheme in this paper and the existing schemes.Among them, the proposed algorithm and Reference[36]schemes introduces the bilinear mapping operations in the query process,and the computational cost is higher than that in [44,45] that uses the integer encryption operations,but References [44-46] schemes need to encrypt and send the sensor data to a third-party data storage server in advance, this process will inevitably have time delay and generate encryption overhead.The scheme in [36] do not need to encrypt and send sensor data, and support users to query sensor data in real time.It only supports single-dimensional query, so the overall computational cost is higher than the propose scheme when multi-dimensional query is performed.

Figure 5: Comparing the calculation cost of full-dimensional query

5.1.3 Partial Dimensional Query Calculation Cost

Partial dimension quantity query means that when the total number of dimensions is certain,each query only queries the sensor data of part of the dimensions.Assuming that the total number of dimensionsk=10, the query interval lengthl=36, and the upper bound of the query intervalq=50, part of the dimension queries with dimensions 5, 7, and 9 are performed respectively.Fig.6 shows the comparison results of the calculation cost of the proposed scheme and the existing schemes when performing partial dimension query.It should be noted that, considering that even if part of the dimension query is performed, the amount of data stored by the data storage server should be the same as that of the full-dimensional query.Therefore, for a solution that needs to encrypt and store sensor data, the encryption overhead is still equivalent to full-dimensional query.Although the proposed algorithm and literature [36,46] all uses the bilinear mapping operation, the query stage overhead is high, but because the proposed and Reference [36] schemes are for real-time query of sensor data, they do not needs to be encrypted and stored in advance, so when fewer dimension queries are performed under the condition of a given total number of dimensions (5, 7 queries), the overall computing overhead is still lower than that of a solution that requires encrypted storage of sensor data.Since Reference [36] only supports single-dimensional query, it needs to repeatedly send query trapdoors when performing multi-dimensional query, and its overall computational cost is also higher than the proposed scheme.

Figure 6: Comparing the calculation cost of different dimensions

5.1.4 Computational Overhead for Querying with Different Interval Lengths

The query of different interval length means that when the total number of dimensions, the upper limit of the query interval length, and the upper limit of the query interval are certain, each query only queries the finite interval length.Suppose the total number of dimensionsk=10,the upper limit of the query interval lengthl=100, and the upper limit of the query intervalq=100.When querying 25%,50%, and 75% length intervals respectively, the proposed algorithm and the query of different interval lengths refer to the case of the total number of dimensions, the upper limit of the query interval length,and the upper limit of the query interval, each query only queries the finite interval length.Suppose the total number of dimensions’k=10, the upper limit of the query interval lengthl=100, and the upper limit of the query intervalq=100.When querying 25%, 50%, and 75% length intervals respectively,compare the proposed scheme with existing schemes when completing a query calculation cost, and the result is shown in Fig.7.

Figure 7: Comparing the computational cost of queries with different interval lengths

According to the analysis of the calculation cost of the query of the number of dimensions in this article, it can be seen that for the storage scheme that needs to encrypt the original sensor data,the encryption cost is the same as the calculation cost of the full-dimensional query.Except for the proposed and Reference [36] schemes, the computational cost of the other three schemes is not related to the length of the query interval, but is related to the total number of query dimensions.However,due to References [44-46] schemes, the original sensor data needs to be encrypted and stored in advance,so that when the query interval length is large and the total number of dimensions is high, the overall computational cost is relatively high.The Reference [36] scheme does not support multi-dimensional queries,and repeatedly sends single-dimensional query trapdoors when performing multi-dimensional queries.It uses more vectors than the proposed scheme, and the computational cost is higher.

5.2 Communication Overhead

This section compares the communication overhead of the proposed scheme and the existing schemes.Set the total number of dimensionsk=10, the length of the query intervall=36, and the upper bound of the query intervalq=50 full-dimensional query, and compare the communication overhead of the whole process of different schemes.In order to compare the communication overhead of each scheme fairly, the following assumptions are made.

1) The security parameters (or key length) of each scheme are 128 bits.

2) The hash algorithm used in each scheme is SHA-1, therefore, the hash digest length of each scheme is 160 bits.

3) The random numbers used in each scheme are all 64 bits.

The communication process of the proposed scheme mainly lies in the sending stage of query trapdoor, the stage of fog device feedback from IoT devices and the stage of user feedback query from fog devices.Since the proposed scheme is to vectorize the query matrix, the communication overhead of the query trapdoor is relatively large, and the overhead of the other stages is relatively small.The communication cost of the query trapdoor generated of the proposed scheme during the sending phase of the query trapdoor is about 31016 B, the communication cost of the fog device phase of the IoT device feedback is about 1365 B, and the communication cost of the fog device feedback query user part is a bout1370 B,a total of about33751 B.Tab.3shows the communication overhead of the scheme in this article and the comparison scheme.

Table 3: Communication overhead comparison

Reference [44] scheme uses a homomorphic encryption algorithm in the integer domain improved from the SHE encryption scheme.Under the same conditions, the ciphertext length is small, but its algorithm is susceptible to noise in the ciphertext.When the noise in the ciphertext is large, it affects the decryption process.Reference [45] scheme has a slightly lower communication overhead than the proposed scheme, however this and Reference [44] schemes encrypt sensor data and query the data transmitted to the server, and it requires two servers to participate in the calculation, and the transmission delay is too large to realize the real-time query of sensor data.Reference [46] scheme query trapdoor is smaller in size, which makes its communication cost lower than the proposed and Reference [36] schemes, but its computational cost is high, and it does not support real-time query of sensor data.Reference [36] scheme that provides real-time data query only supports single-dimensional query, and multiple queries need to be initiated when querying a multi-dimensional sensor data, and its communication overhead is higher than the solution in this paper.

In summary, the proposed algorithm achieves a good balance in computing overhead, communication overhead, and support for multi-dimensional query features.

6 Conclusion

In this paper, a multi-dimensional security query scheme with privacy protection features with high communication efficiency is designed for the fog-enhanced industrial Internet of Things.The proposed scheme uses the BGN homomorphic encryption algorithm to encrypt the query trapdoor,and integrates matrix decomposition and reconstruction technology to protect the privacy of the query mode, while ensuring that the sensor data will not be leaked to unauthorized parties.Security analysis and simulation experiments show that the scheme in this paper can maintain a low level of computing and communication costs while ensuring multiple security features and multi-dimensional query functions.The next step will further optimize the proposed scheme, such as further improving communication efficiency and reducing communication overhead.

Acknowledgement:Taif University Researchers Supporting Project Number (TURSP-2020/260), Taif University, Taif, Saudi Arabia

Funding Statement:This study was supported by the Institute for Information & Communications Technology Planning&Evaluation(IITP)grant funded by the Korean government(MSIT)(No.2019-0-01343, Training Key Talents in Industrial Convergence Security).

Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.