AI News and Guides

Explore the best AI News and Guides — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • Perceptual robotics

    Perceptual robotics

    Perceptual robotics is an interdisciplinary science linking Robotics and Neuroscience. It investigates biologically motivated robot control strategies, concentrating on perceptual rather than cognitive processes and thereby sides with J. J. Gibson's view against the Poverty of the stimulus theory. As a working definition, the following quote from Chapter 64 by H. Bülthoff, C. Wallraven and M. Giese from The Springer Handbook of Robotics, edited by Bruno Siciliano and Oussama Khatib, published by Springer in 2007, could be used: In the following we will apply the term Perceptual Robotics to signify the design of robots based on principles that are derived from human perception on all three levels in the sense of Marr. This includes a realization in terms of specific neural circuits as well as the transfer of more abstract biologically-inspired strategies for the solution of relevant computational problems.

    Read more →
  • Chunked transfer encoding

    Chunked transfer encoding

    Chunked transfer encoding is a streaming data transfer mechanism available in Hypertext Transfer Protocol (HTTP) version 1.1, defined in RFC 9112 §7.1. In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. At any given time, no knowledge of the data stream outside the currently-being-processed chunk is necessary for either the sender or the receiver. Each chunk is preceded by its size in bytes and transmission ends when a zero-length chunk is received. The chunked keyword in the Transfer-Encoding header is used to indicate chunked transfer. Chunked transfer encoding is not supported in HTTP/2, which provides its own mechanisms for data streaming. == Rationale == The introduction of chunked encoding provided various benefits: Chunked transfer encoding allows a server to maintain an HTTP persistent connection for dynamically generated content. In this case, the HTTP Content-Length header cannot be used to delimit the content and the next HTTP request/response, as the content size is not yet known. Chunked encoding has the benefit that it is not necessary to generate the full content before writing the header, as it allows streaming of content as chunks and explicitly signaling the end of the content, making the connection available for the next HTTP request/response. Chunked encoding allows the sender to send additional header fields after the message body. This is important in cases where values of a field cannot be known until the content has been produced, such as when the content of the message must be digitally signed. Without chunked encoding, the sender would have to buffer the content until it was complete in order to calculate a field value and send it before the content. == Applicability == For version 1.1 of the HTTP protocol, the chunked transfer mechanism is considered to be always and anyway acceptable, even if not listed in the Transfer-Encoding (TE) request header field, and when used with other transfer mechanisms, should always be applied last to the transferred data and never more than one time. This transfer encoding method also allows additional entity header fields to be sent after the last chunk if the client specified the "trailers" parameter as an argument of the TE request field. The origin server of the response can also decide to send additional entity trailers even if the client did not specify the "trailers" parameter, but only if the metadata is optional (i.e. the client can use the received entity without them). Whenever the trailers are used, the server should list their names in the Trailer header field; three header field types are specifically prohibited from appearing as a trailer field: Content-Length, Trailer, and Transfer-Encoding. == Format == If a Transfer-Encoding field with a value of "chunked" is specified in an HTTP message (either a request sent by a client or the response from the server), the body of the message consists of one or more chunks and one terminating chunk with an optional trailer before the final ␍␊ sequence (i.e. carriage return followed by line feed). Each chunk starts with the number of octets of the data it embeds expressed as a hexadecimal number in ASCII followed by optional parameters (chunk extension) and a terminating ␍␊ sequence, followed by the chunk data. The chunk is terminated by ␍␊. If chunk extensions are provided, the chunk size is terminated by a semicolon and followed by the parameters, each also delimited by semicolons. Each parameter is encoded as an extension name followed by an optional equal sign and value. These parameters could be used for a running message digest or digital signature, or to indicate an estimated transfer progress, for instance. The terminating chunk is a special chunk of zero length. It may contain a trailer, which consists of a (possibly empty) sequence of entity header fields. Normally, such header fields would be sent in the message's header; however, it may be more efficient to determine them after processing the entire message entity. In that case, it is useful to send those headers in the trailer. Header fields that regulate the use of trailers are Transfer-Encoding with the "trailers" parameter (used in requests) and Trailer (used in responses). == Use with compression == HTTP servers often use compression to optimize transmission, for example with Content-Encoding: gzip or Content-Encoding: deflate. If both compression and chunked encoding are enabled, then the content stream is first compressed, then chunked; so the chunk encoding itself is not compressed, and the data in each chunk is compressed holistically (i.e. based on the whole content). The remote endpoint then decodes the stream by concatenating the chunks and uncompressing the result. == Example == === Encoded data === The following example contains three chunks of size 4, 7, and 11 (hexadecimal "B") octets of data. 4␍␊Wiki␍␊7␍␊pedia i␍␊B␍␊n ␍␊chunks.␍␊0␍␊␍␊ Below is an annotated version of the encoded data. 4␍␊ (chunk size is four octets) Wiki (four octets of data) ␍␊ (end of chunk) 7␍␊ (chunk size is seven octets) pedia i (seven octets of data) ␍␊ (end of chunk) B␍␊ (chunk size is eleven octets) n ␍␊chunks. (eleven octets of data) ␍␊ (end of chunk) 0␍␊ (chunk size is zero octets, no more chunks) ␍␊ (end of final chunk with zero data octets) Note: Each chunk's size excludes the two ␍␊ bytes that terminate the data of each chunk. === Decoded data === Decoding the above example produces the following octets: Wikipedia in ␍␊chunks. The bytes above are typically displayed as Wikipedia in chunks.

    Read more →
  • Cloud Data Management Interface

    Cloud Data Management Interface

    ISO/IEC 17826 Information technology — Cloud Data Management Interface (CDMI) Version 2.0.0 is an international standard that specifies a protocol for self-provisioning, administering and managing access to data stored in cloud storage, object storage, storage area network and network attached storage systems. The CDMI standard is developed and maintained by the Storage Networking Industry Association, who makes a publicly accessible version of the specification available. CDMI defines new resource representations to enable standardized management of any URI-accessible data, and defines RESTful HTTP operations using these representations to discover the capabilities of the storage system, discover stored data, access and update management metadata, specify data storage protocols (such as iSCSI and NFS) through which the stored data is accessed, and provide cross-system and cross-cloud import and export in order to enable data portability. Management functions enabled by CDMI include managing data ownership, identity mapping, access controls, user-specified metadata, and to declaratively specify desired data protection, data retention, constraints on geographic placement, desired quality of service, data versioning and security requirements. CDMI also defines utility services to facilitate data management, such the ability to query data matching specific criteria, and includes extensions to perform bulk updates using CDMI Jobs. == Capabilities == Compliant implementations must provide access to a set of configuration parameters known as capabilities. These are either boolean values that represent whether or not a system supports things such as queues, export via other protocols, path-based storage and so on, or numeric values expressing system limits, such as how much metadata may be placed on an object. As a minimal compliant implementation can be quite small, with few features, clients need to check the cloud storage system for a capability before attempting to use the functionality it represents. Resource allocation assignments limited to the data management interface protocols must possess access bypass capabilities which extend beyond the layered framework. This integral function is vital to the prevention of transport layer session hijacking by unauthorized entities which may circumvent standard interfacing security parameters. == Containers == A CDMI client may access objects, including containers, by either name or object id (OID), assuming the CDMI server supports both methods. When storing objects by name, it is natural to use nested named containers; the resulting structure corresponds exactly to a traditional filesystem directory structure. == Objects == Objects are similar to files in a traditional file system, but are enhanced with an increased amount and capacity for metadata. As with containers, they may be accessed by either name or OID. When accessed by name, clients use URLs that contain the full pathname of objects to create, read, update and delete them. When accessed by OID, the URL specifies an OID string in the cdmi-objectid container; this container presents a flat name space conformant with standard object storage system semantics. Subject to system limits, objects may be of any size or type and have arbitrary user-supplied metadata attached to them. Systems that support query allow arbitrary queries to be run against the metadata. == Domains, Users and Groups == CDMI supports the concept of a domain, similar in concept to a domain in the Windows Active Directory model. Users and groups created in a domain share a common administrative database and are known to each other on a "first name" basis, i.e. without reference to any other domain or system. Domains also function as containers for usage and billing summary data. == Access Control == CDMI exactly follows the ACL and ACE model used for file authorization operations by NFSv4. This makes it also compatible with Microsoft Windows systems. == Metadata == CDMI draws much of its metadata model from the XAM specification. Objects and containers have "storage system metadata", "data system metadata" and arbitrary user specified metadata, in addition to the metadata maintained by an ordinary filesystem (atime etc.). == Queries == CDMI specifies a way for systems to support arbitrary queries against CDMI containers, with a rich set of comparison operators, including support for regular expressions. == Queues == CDMI supports the concept of persistent FIFO (first-in, first-out) queues. These are useful for job scheduling, order processing and other tasks in which lists of things must be processed in order. == Compliance == Both retention intervals and retention holds are supported by CDMI. A retention interval consists of a start time and a retention period. During this time interval, objects are preserved as immutable and may not be deleted. A retention hold is usually placed on an object because of judicial action and has the same effect: objects may not be changed nor deleted until all holds placed on them are removed. == Billing == Summary information suitable for billing clients for on-demand services can be obtained by authorized users from systems that support it. == Serialization == Serialization of objects and containers allows export of all data and metadata on a system and importation of that data into another cloud system. == Foreign protocols == CDMI supports export of containers as NFS or CIFS shares. Clients that mount these shares see the container hierarchy as an ordinary filesystem directory hierarchy, and the objects in the containers as normal files. Metadata outside of ordinary filesystem metadata may or may not be exposed. Provisioning of iSCSI LUNs is also supported. == Client SDKs == CDMI Reference Implementation Droplet libcdmi-java libcdmi-python .NET SDK

    Read more →
  • Radical trust

    Radical trust

    Radical trust is the confidence that any structured organization, such as a government, library, business, religion, or museum, has in collaboration and empowerment within online communities. Specifically, it pertains to the use of blogs, wiki and online social networking platforms by organizations to cultivate relationships with an online community that then can provide feedback and direction for the organization's interest. The organization 'trusts' and uses that input in its management. One of the first appearances of the notion of radical trust appears in an info graphic outlining the base principles of web 2.0 in Tim O'Reilly's weblog post "What is Web 2.0". Radical Trust is listed as the guiding example of trusting the validity of consumer generated media. This concept is considered to be an underlying assumption of Library 2.0. The adoption of radical trust by a library would require its management let go of some of its control over the library and building an organization without an end result in mind. The direction a library would take would be based on input provided by people through online communities. These changes in the organization may merely be anecdotal in nature, making this method of organization management dramatically distinct from data-based or evidence based management. In marketing, Collin Douma further describes the notion of radical trust as a key mindset required for marketers and advertisers to enter the social media marketing space. Conventional marketing dictates and maintains control of messages to cause the greatest persuasion in consumer decisions, but Douma argued that in the social media space, brands would need to cede that control in order to build brand loyalty.

    Read more →
  • Right to explanation

    Right to explanation

    In the regulation of algorithms, particularly artificial intelligence and its subfield of machine learning, a right to [an] explanation is a right to be given an explanation for an output of the algorithm. Such rights primarily refer to individual rights to be given an explanation for decisions that significantly affect an individual, particularly legally or financially. For example, a person who applies for a loan and is denied may ask for an explanation, which could be "Credit bureau X reports that you declared bankruptcy last year; this is the main factor in considering you too likely to default, and thus we will not give you the loan you applied for." Some such legal rights already exist, while the scope of a general "right to explanation" is a matter of ongoing debate. There have been arguments made that a "social right to explanation" is a crucial foundation for an information society, particularly as the institutions of that society will need to use digital technologies, artificial intelligence, machine learning. In other words, that the related automated decision making systems that use explainability would be more trustworthy and transparent. Without this right, which could be constituted both legally and through professional standards, the public will be left without much recourse to challenge the decisions of automated systems. == Examples == === Credit scoring in the United States === Under the Equal Credit Opportunity Act (Regulation B of the Code of Federal Regulations), Title 12, Chapter X, Part 1002, §1002.9, creditors are required to notify applicants who are denied credit with specific reasons for the detail. As detailed in §1002.9(b)(2): (2) Statement of specific reasons. The statement of reasons for adverse action required by paragraph (a)(2)(i) of this section must be specific and indicate the principal reason(s) for the adverse action. Statements that the adverse action was based on the creditor's internal standards or policies or that the applicant, joint applicant, or similar party failed to achieve a qualifying score on the creditor's credit scoring system are insufficient. The official interpretation of this section details what types of statements are acceptable. Creditors comply with this regulation by providing a list of reasons (generally at most 4, per interpretation of regulations), consisting of a numeric reason code (as identifier) and an associated explanation, identifying the main factors affecting a credit score. An example might be: 32: Balances on bankcard or revolving accounts too high compared to credit limits === European Union === The European Union General Data Protection Regulation (GDPR, enacted 2016, taking effect 2018) extends the automated decision-making rights in the 1995 Data Protection Directive to provide a legally disputed form of a right to an explanation, stated as such in Recital 71: "[the data subject should have] the right ... to obtain an explanation of the decision reached". In full: The data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or similarly significantly affects him or her, such as automatic refusal of an online credit application or e-recruiting practices without any human intervention. ... In any case, such processing should be subject to suitable safeguards, which should include specific information to the data subject and the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision. However, the extent to which the regulations themselves provide a "right to explanation" is heavily debated. There are two main strands of criticism. There are significant legal issues with the right as found in Article 22 — as recitals are not binding, and the right to an explanation is not mentioned in the binding articles of the text, having been removed during the legislative process. In addition, there are significant restrictions on the types of automated decisions that are covered — which must be both "solely" based on automated processing, and have legal or similarly significant effects — which significantly limits the range of automated systems and decisions to which the right would apply. In particular, the right is unlikely to apply in many of the cases of algorithmic controversy that have been picked up in the media. The UK has also recently amended its implementation of Article 22. A second potential source of such a right has been pointed to in Article 15, the "right of access by the data subject". This restates a similar provision from the 1995 Data Protection Directive, allowing the data subject access to "meaningful information about the logic involved" in the same significant, solely automated decision-making, found in Article 22. Yet this too suffers from alleged challenges that relate to the timing of when this right can be drawn upon, as well as practical challenges that mean it may not be binding in many cases of public concern. Other EU legislative instruments contain explanation rights. The European Union's Artificial Intelligence Act provides in Article 86 a "[r]ight to explanation of individual decision-making" of certain high risk systems which produce significant, adverse effects to an individual's health, safety or fundamental rights. The right provides for "clear and meaningful explanations of the role of the AI system in the decision-making procedure and the main elements of the decision taken", although only applies to the extent other law does not provide such a right. The Digital Services Act in Article 27, and the Platform to Business Regulation in Article 5, both contain rights to have the main parameters of certain recommender systems to be made clear, although these provisions have been criticised as not matching the way that such systems work. The Platform Work Directive, which provides for regulation of automation in gig economy work as an extension of data protection law, further contains explanation provisions in Article 11, using the specific language of "explanation" in a binding article rather than a recital as is the case in the GDPR. Scholars note that remains uncertainty as to whether these provisions imply sufficiently tailored explanation in practice which will need to be resolved by courts. === France === In France the 2016 Loi pour une République numérique (Digital Republic Act or loi numérique) amends the country's administrative code to introduce a new provision for the explanation of decisions made by public sector bodies about individuals. It notes that where there is "a decision taken on the basis of an algorithmic treatment", the rules that define that treatment and its "principal characteristics" must be communicated to the citizen upon request, where there is not an exclusion (e.g. for national security or defence). These should include the following: the degree and the mode of contribution of the algorithmic processing to the decision- making; the data processed and its source; the treatment parameters, and where appropriate, their weighting, applied to the situation of the person concerned; the operations carried out by the treatment. Scholars have noted that this right, while limited to administrative decisions, goes beyond the GDPR right to explicitly apply to decision support rather than decisions "solely" based on automated processing, as well as provides a framework for explaining specific decisions. Indeed, the GDPR automated decision-making rights in the European Union, one of the places a "right to an explanation" has been sought within, find their origins in French law in the late 1970s. == Criticism == Some argue that a "right to explanation" is at best unnecessary, at worst harmful, and threatens to stifle innovation. Specific criticisms include: favoring human decisions over machine decisions, being redundant with existing laws, and focusing on process over outcome. Authors of study "Slave to the Algorithm? Why a 'Right to an Explanation' Is Probably Not the Remedy You Are Looking For" Lilian Edwards and Michael Veale argue that a right to explanation is not the solution to harms caused to stakeholders by algorithmic decisions. They also state that the right of explanation in the GDPR is narrowly defined, and is not compatible with how modern machine learning technologies are being developed. With these limitations, defining transparency within the context of algorithmic accountability remains a problem. For example, providing the source code of algorithms may not be sufficient and may create other problems in terms of privacy disclosures and the gaming of technical systems. To mitigate this issue, Edwards and Veale argue that an auditing system could be more effective, to allow auditors to loo

    Read more →
  • Human rights and encryption

    Human rights and encryption

    Human rights and encryption refers to the ways in which digital encryption affects human rights. Encryption can be used as both a detriment and a boon to human rights; for example, encryption can be used to enforce digital rights management for video games. This kind of video game licensing can render software unusable long term and represents the erosion of consumer rights. At the same time, encryption is fundamental part of internet security. Asymmetrical encryption is used extensively online for authentication, providing users confidence their internet traffic is not being misdirected. Encryption is also used to obfuscate information as it travels from end-to-end over the internet, preventing eavesdropping and tampering. Encryption can also provide anonymity, which is an important consideration for freedom of expression. Despite its drawbacks, encryption is essential for a free, open, and trustworthy internet. == Background == === Human rights === Human rights are moral principles or norms for human behaviour that are regularly protected as legal rights in national and international law. They are commonly understood as inalienable, fundamental rights "to which a person is inherently entitled simply because they are a human being". Those rights are "inherent in all human beings" regardless of their nationality, location, language, religion, ethnic origin, or any other status. They are applicable everywhere and at every time and are universal and egalitarian. === Cryptography === Cryptography is a long-standing subfield of both mathematics and computer science. It can generally be defined as "the protection of information and computation using mathematical techniques." Encryption and cryptography are closely interlinked, although "cryptography" has a broader meaning. For example, a digital signature is "cryptography", but not technically "encryption". == Overview == Under international human rights law, freedom of expression is recognized as a human right under Article 19 of the Universal Declaration of Human Rights (UDHR) and the International Covenant on Civil and Political Rights (ICCPR). In Article 19 of the UDHR states that "everyone shall have the right to hold opinions without interference" and "everyone shall have the right to freedom of expression; this right shall include freedom to seek, receive and impart information and ideas of all kinds, regardless of frontiers, either orally, in writing or in print, in the form of art, or through any other media of his choice". Since the 1970s, the availability of digital computing and the invention of public-key cryptography have made encryption more widely available. (Previously, encryption techniques were the domain of nation-state actors.) Cryptographic techniques are also used to protect the anonymity of communicating actors and privacy more generally. The availability and use of encryption continue to lead to complex, important, and highly contentious legal policy debates. Some government agencies have made statements or proposals to lessen such usage and deployment due to hurdles it presents for government access. The rise of commercial end-to-end encryption services have pushed towards more debates around the use of encryption and the legal status of cryptography in general. Encryption, as defined above, is a set of cryptographic techniques to protect information. The normative value of encryption, however, is not fixed but varies with the type and purpose of the cryptographic methods used. Traditionally, encryption (cipher) techniques were used to ensure the confidentiality of communications and prevent access to information and communications by others and intended recipients. Cryptography can also ensure the authenticity of communicating parties and the integrity of communications contents, providing a key ingredient for enabling trust in the digital environment. There is a growing awareness within human rights organizations that encryption plays an important role in realizing a free, open, and trustworthy Internet. UN Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression David Kaye observed, during the Human Rights Council in June 2015, that encryption and anonymity deserve a protected status under the rights to privacy and freedom of expression: "Encryption and anonymity, today's leading vehicles for online security, provide individuals with a means to protect their privacy, empowering them to browse, read, develop and share opinions and information without interference and enabling journalists, civil society organizations, members of ethnic or religious groups, those persecuted because of their sexual orientation or gender identity, activists, scholars, artists and others to exercise the rights to freedom of opinion and expression." == Encryption in media and communication == In the context of media and communication, two types of encryption in media and communication can be distinguished: It could be used as a result of the choice of a service provider or deployed by Internet users. Client-side encryption tools and technologies are relevant for marginalized communities, journalists and other online media actors practicing journalism as a way of protecting their rights. It could prevent unauthorized third party access, but the service provider implementing it would still have access to the relevant user data. End-to-end encryption is an encryption technique that refers to encryption that also prevents service providers themselves from having access to the user's communications. The implementation of these forms of encryption has sparked the most debate since the start of the 21st century. === Service providers deployed techniques to prevent unauthorized third-party access. === Among the most widely deployed cryptographic techniques is the securitization of communications channel between internet users and specific service providers from man-in-the-middle attacks, access by unauthorized third parties. Given the breadth of nuances involved, these cryptographic techniques must be run jointly by both the service user and the service provider in order to work properly. They require service providers, including online news publisher(s) or social network(s), to actively implement them into service design. Users cannot deploy these techniques unilaterally; their deployment is contingent on active participation by the service provider. The TLS protocol, which becomes visible to the normal internet user through the HTTPS header, is widely used for securing online commerce, e-government services and health applications as well as devices that make up networked infrastructures, e.g., routers, cameras. However, although the standard has been around since 1990, the wider spread and evolution of the technology has been slow. As with other cryptographic methods and protocols, the practical challenges related to proper, secure and (wider) deployment are significant and have to be considered. Many service providers still do not implement TLS or do not implement it well. In the context of wireless communications, the use of cryptographic techniques that protect communications from third parties are also important. Different standards have been developed to protect wireless communications: 2G, 3G and 4G standards for communication between mobile phones, base stations and base stations controllers; standards to protect communications between mobile devices and wireless routers ('WLAN'); and standards for local computer networks. One common weakness in these designs is that the transmission points of the wireless communication can access all communications e.g., the telecommunications provider. This vulnerability is exacerbated when wireless protocols only authenticate user devices, but not the wireless access point. Whether the data is stored on a device, or on a local server as in the cloud, there is also a distinction between 'at rest'. Given the vulnerability of cellphones to theft for instance, particular attention may be given to limiting service provided access. This does not exclude the situation that the service provider discloses this information to third parties like other commercial entities or governments. The user needs to trust the service provider to act in their interests. The possibility that a service provider is legally compelled to hand over user information or to interfere with particular communications with particular users, remains. === Privacy-enhancing Technologies === There are services that specifically market themselves with claims not to have access to the content of their users' communication. Service Providers can also take measures that restrict their ability to access information and communication, further increasing the protection of users against access to their information and communications. The integrity of these Privacy Enhancing Technologies (PETs), depends on delicate design decisions as well as the

    Read more →
  • Undeniable signature

    Undeniable signature

    An undeniable signature is a digital signature scheme which allows the signer to be selective to whom they allow to verify signatures. The scheme adds explicit signature repudiation, preventing a signer later refusing to verify a signature by omission; a situation that would devalue the signature in the eyes of the verifier. It was invented by David Chaum and Hans van Antwerpen in 1989. == Overview == In this scheme, a signer possessing a private key can publish a signature of a message. However, the signature reveals nothing to a recipient/verifier of the message and signature without taking part in either of two interactive protocols: Confirmation protocol, which confirms that a candidate is a valid signature of the message issued by the signer, identified by the public key. Disavowal protocol, which confirms that a candidate is not a valid signature of the message issued by the signer. The motivation for the scheme is to allow the signer to choose to whom signatures are verified. However, that the signer might claim the signature is invalid at any later point, by refusing to take part in verification, would devalue signatures to verifiers. The disavowal protocol distinguishes these cases removing the signer's plausible deniability. It is important that the confirmation and disavowal exchanges are not transferable. They achieve this by having the property of zero-knowledge; both parties can create transcripts of both confirmation and disavowal that are indistinguishable, to a third-party, of correct exchanges. The designated verifier signature scheme improves upon deniable signatures by allowing, for each signature, the interactive portion of the scheme to be offloaded onto another party, a designated verifier, reducing the burden on the signer. == Zero-knowledge protocol == The following protocol was suggested by David Chaum. A group, G, is chosen in which the discrete logarithm problem is intractable, and all operation in the scheme take place in this group. Commonly, this will be the finite cyclic group of order p contained in Z/nZ, with p being a large prime number; this group is equipped with the group operation of integer multiplication modulo n. An arbitrary primitive element (or generator), g, of G is chosen; computed powers of g then combine obeying fixed axioms. Alice generates a key pair, randomly chooses a private key, x, and then derives and publishes the public key, y = gx. === Message signing === Alice signs the message, m, by computing and publishing the signature, z = mx. === Confirmation (i.e., avowal) protocol === Bob wishes to verify the signature, z, of m by Alice under the key, y. Bob picks two random numbers: a and b, and uses them to blind the message, sending to Alice: c = magb. Alice picks a random number, q, uses it to blind, c, and then signing this using her private key, x, sending to Bob: s1 = cgq ands2 = s1x. Note that s1x = (cgq)x = (magb)xgqx = (mx)a(gx)b+q = zayb+q. Bob reveals a and b. Alice verifies that a and b are the correct blind values, then, if so, reveals q. Revealing these blinds makes the exchange zero knowledge. Bob verifies s1 = cgq, proving q has not been chosen dishonestly, and s2 = zayb+q, proving z is valid signature issued by Alice's key. Note that zayb+q = (mx)a(gx)b+q. Alice can cheat at step 2 by attempting to randomly guess s2. === Disavowal protocol === Alice wishes to convince Bob that z is not a valid signature of m under the key, gx; i.e., z ≠ mx. Alice and Bob have agreed an integer, k, which sets the computational burden on Alice and the likelihood that she should succeed by chance. Bob picks random values, s ∈ {0, 1, ..., k} and a, and sends: v1 = msga and v2 = zsya, where exponentiating by a is used to blind the sent values. Note that v2 = zsya = (mx)s(gx)a = v1x. Alice, using her private key, computes v1x and then the quotient, v1xv2−1 = (msga)x(zsgxa)−1 = msxz−s = (mxz−1)s. Thus, v1xv2−1 = 1, unless z ≠ mx. Alice then tests v1xv2−1 for equality against the values: (mxz−1)i for i ∈ {0, 1, …, k}; which are calculated by repeated multiplication of mxz−1 (rather than exponentiating for each i). If the test succeeds, Alice conjectures the relevant i to be s; otherwise, she conjectures random value. Where z = mx, (mxz−1)i = v1xv2−1 = 1 for all i, s is unrecoverable. Alice commits to i: she picks a random r and sends hash(r, i) to Bob. Bob reveals a. Alice confirms that a is the correct blind (i.e., v1 and v2 can be generated using it), then, if so, reveals r. Revealing these blinds makes the exchange zero knowledge. Bob checks hash(r, i) = hash(r, s), proving Alice knows s, hence z ≠ mx. If Alice attempts to cheat at step 3 by guessing s at random, the probability of succeeding is 1/(k + 1). So, if k = 1023 and the protocol is conducted ten times, her chances are 1 to 2100.

    Read more →
  • IBM remote batch terminals

    IBM remote batch terminals

    The IBM 2780 and the IBM 3780 are devices developed by IBM for performing remote job entry (RJE) and other batch functions over telephone lines; they communicate with the mainframe via Binary Synchronous Communications (BSC or Bisync) and replaced older terminals using synchronous transmit-receive (STR). In addition, IBM has developed workstation programs for the 1130, 360/20, 2922, System/360 other than 360/20, System/370 and System/3. == 2780 Data Transmission Terminal == The 2780 Data Transmission Terminal first shipped in 1967. It consists of: A line printer similar to the IBM 1443 that can print up to 240 lines per minute (lpm), or 300 lpm using an extremely restricted character set. A card reader/punch unit, similar to an IBM 1442, that can read up to 400 cards per minute (cpm) and can punch up to 355 cpm. A line buffer that stores data received or to be transmitted over the communications line. A binary synchronous adapter which controls the flow of data over the communications line. The 2780 is capable of local (offline) card to print operation. It comes in four models: Model 1: Can read punched cards and transmit the data to a remote host computer, and can receive and print data sent by the host. Model 2: Same as Model 1 but adds the ability to punch card data received from the host. Model 3: Can only print data received from the host, but not send data to it. Model 4: Can read and punch card data, but has no printing capabilities. The 2780 uses a dedicated communication line at speeds of 1200, 2000, 2400 or 4800 bits per second. It is a half duplex device, although full duplex lines can be used with some increase in throughput. It can communicate in Transcode (a 6-bit code), 8-bit EBCDIC, or 7-bit ASCII. == 2770 Data Communication System == The 2770, announced in 1969, "was said to surpass all other IBM terminals in the variety of available input-output devices." The 2770 was developed by the IBM General Products Division (GPD) in Rochester, MN. It comes standard with a desktop terminal with keyboard. The printer and other devices (any two in any combination) can be attached to the 2772 Multi-Purpose Control unit. Possible devices include: 50 Magnetic Data Inscriber 545 Card Punch Model 3 (non-printing) or Model 4 (printing) 1017 Paper Tape Reader 1018 Paper Tape Punch 1053 Printer Model 1 1255 Magnetic Character Reader Models 1, 2 or 3 2203 Printer Model A1 or A2 2213 Printer Model 1 or 2 2265 Display Station Model 2 2502 Card Reader Model A1 or A2 5496 Data Recorder == 3780 Data Communications Terminal == In May 1972, IBM announced the IBM 3780, an enhanced version of the 2780. The 3780 was developed by IBM's Data Processing Division (DPD). There is one model, with an optional card punch. The 3780 drops Transcode support and incorporates several performance enhancements. It supports compression of blank fields in data using run-length encoding. It provides the ability to interleave data between devices, introduces double buffering, and adds support for the Wait-before-transmit ACKnowledgement (WACK) and Temporary Text Delay (TTD) Binary Synchronous control characters. The integrated punched card unit can read cards at 600 cards per minute. The integrated printer is rated at 300, 350 or 425 lines per minute based on characters set (63, 52 or 39 characters). The 3781 Card Punch is an optional feature. It punches 160 columns per second, or 91 cards per minute if all 80 columns are punched. The IBM 2780 and 3780 were later emulated on various types of equipment, including eventually the personal computer. A notable early emulation was the DN60, by Digital Equipment Corporation in the late 1970s. == 3770 Data Communications System == In 1974 IBM Data Processing Division (DPD) offered a successor to the 3780, called the 3770 Data Communications System, supporting SDLC, BSC, BSC Multi-leaving and SNA, depending on the configuration. The 3770 is a family of desk console style terminals that offers a variety of keyboard and printer combinations as well as I/O equipment attachment and communications features. The terminals come built into a desk and include the following models: 3771 Communication Terminal (optional card reader, optional card punch, wire matrix printer) Models 1 (40 cps printer), 2 (80 cps printer), and 3 (120 cps printer). 3773 Communication Terminal (diskette, wire matrix printer) Models 1 (40 cps printer), 2 (80 cps printer), and 3 (120 cps printer). Each model has a P version which adds some programming features. 3774 Communication Terminal (optional card reader, optional card punch, optional belt printer, wire matrix printer) Models 1 (80 cps printer), and 2 (120 cps printer). Each model has a P version which adds some programming features, a 480-character display and a non-removable diskette. 3775 Communication Terminal (optional card reader, optional card punch, optional diskette, belt printer) Model 1 (120 lpm printer). The model P1 adds some programming features, a 480-character display and a non-removable diskette. 3776 Communication Terminal (optional card reader, optional card punch, optional diskette, belt printer) Models 1 (300 lpm printer) and 2 (400 lpm printer). Models 3 and 4 are similar to models 1 and 2. 3777 Communication Terminal (optional card reader, optional diskette, train printer) Model 1 (up to 1000 lpm printer depending on character set). Model 2 adds an optional card punch, model 3 adds an optional magnetic tape drive and model 4 replaces the train printer with a slower model called the IBM 3262. The model 4 also allows a second, optional, 3262. The following I/O devices can be attached to a 3770 terminal: IBM 2502 Card Reader: Models A1 (up to 150 card per minute), A2 (up to 300 cards per minute) or A3 (up to 400 cards per minute) IBM 3203 Printer Model 3: 1000 LPM using 48 character set IBM 3501 Card Reader: Up to 50 cards per minute desktop unit IBM 3521 Card Punch: Up to 50 cards per minute IBM 3782 Card Attachment unit, which allows the 2502 or 3521 to be attached to any terminal except the 3777 IBM 3784 Line Printer, can be attached to a 3774 as a second printer. Up to 155 LPM with 48 characters set print belt. == Workstation programs == IBM distributes workstation programs with systems software including OS/360 Attached Support Processor (ASP) Houston Automatic Spooling Priority (HASP and HASP II) Operating System/Virtual Storage 1 (OS/VS1) Operating System/Virtual Storage 2 (OS/VS2 MVS) Release 2 through 3.8 MVS versions from MVS/SP Version 1 through z/OS Priority Output Writers, Execution processors and input Readers (POWER) Remote Spooling Communications Subsystem (RSCS) Except for the RJE workstation programs in OS/360, these programs use a variation of BSC known as Multi-leaving. In addition, IBM provides separately ordered workstation programs using BSC. Systems Network Architecture (SNA) and TCP/IP. Workstation programs are available from IBM and third-party vendors to support all of these protocols: 2770/3770 2780/3780 Multileaving Network Job Entry (NJE) OS/360 RJE SNA TCP/IP

    Read more →
  • GPTs

    GPTs

    GPTs are custom versions of ChatGPT with added instructions and extra knowledge. GPTs can be used and created from the GPT Store. Any user can easily create them without any programming knowledge. GPTs can be tailored for specific writing styles, topics, or tasks. The ability to create GPTs was introduced in November 2023, and by January 2024, more than 3 million GPTs had been published. == Features and uses == GPTs can be configured to answer complex questions in specific fields, solve problems, provide image-based information, or create digital content. They can be programmed as educational tools, purchasing guides, or technical advisors, as well as for many others applications. GPTs are accessed from the GPT Store section of the ChatGPT web page. The “Explore GPT” link opens the store where the most popular GPTs in each section are highlighted. The GPTs are organized by categories. The store also uses a rating system based on user experiences similar to that used by other app stores such as Apple's App Store or Google Play. Those with the best ratings appear at the top of each category. According to La Vanguardia, the most popular categories are: Personal assistants Learning to program Image generation Creative writing Gaming Entertainment It is expected that in the future the creators of GPTs will be able to monetize them. Companies like Moderna are using GPTs to assist in various specific business tasks. The company has created 750 GPTs for its own internal use. == Configuration == Creating GPTs does not require prior programming knowledge. Free users can use existing GPTs but cannot create their own. Paying subscribers can use the editor on the ChatGPT site to configure the GPT's name, image and description, instructions and access to APIs, along with visibility options. == Criticism == The implementation and use of GPTs has not been without criticism. The GPT Store has been criticized for the proliferation of low-quality GPTs and spam due to a lack of effective moderation. There are also concerns about data privacy and security, as GPTs may collect and use personal information in ways that are not always transparent to users.

    Read more →
  • Business intelligence

    Business intelligence

    Business intelligence (BI) consists of strategies, methodologies, and technologies used by enterprises for data analysis and management of business information to inform business strategies and business operations. Common functions of BI technologies include reporting, online analytical processing, analytics, dashboard development, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics. BI tools can handle large amounts of structured and sometimes unstructured data to help organizations identify, develop, and otherwise create new strategic business opportunities. They aim to allow for the easy interpretation of these big data. Identifying new opportunities and implementing an effective strategy based on insights is assumed to potentially provide businesses with a competitive market advantage and long-term stability, and help them take strategic decisions. Business intelligence can be used by enterprises to support a wide range of business decisions ranging from operational to strategic. Basic operating decisions include product positioning or pricing. Strategic business decisions involve priorities, goals, and directions at the broadest level. In all cases, business intelligence is considered most effective when it combines data from the market in which a company operates (external data) with data from internal company sources, such as financial and operational information. When integrated, external and internal data provide a comprehensive view that creates ‘intelligence’ not possible from any single data source alone. Among their many uses, business intelligence tools empower organizations to gain insight into new markets, to assess demand and suitability of products and services for different market segments, and to gauge the impact of marketing efforts. BI applications use data gathered from a data warehouse (DW) or from a data mart, and the concepts of BI and DW combine as "BI/DW" or as "BIDW". A data warehouse contains a copy of analytical data that facilitates decision support. == History == The earliest known use of the term business intelligence is in Richard Millar Devens' Cyclopædia of Commercial and Business Anecdotes (1865). Devens used the term to describe how the banker Sir Henry Furnese gained profit by receiving and acting upon information about his environment, prior to his competitors: Throughout Holland, Flanders, France, and Germany, he maintained a complete and perfect train of business intelligence. The news of the many battles fought was thus received first by him, and the fall of Namur added to his profits, owing to his early receipt of the news. The ability to collect and react accordingly based on the information retrieved, Devens says, is central to business intelligence. When Hans Peter Luhn, a researcher at IBM, used the term business intelligence in an article published in 1958, he employed the Webster's Dictionary definition of intelligence: "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal." In 1989, Howard Dresner (later a Gartner analyst) proposed business intelligence as an umbrella term to describe "concepts and methods to improve business decision making by using fact-based support systems." It was not until the late 1990s that this usage was widespread. == Definition == According to Solomon Negash and Paul Gray, business intelligence (BI) can be defined as systems that combine: Data gathering Data storage Knowledge management with analysis to evaluate complex corporate and competitive information for presentation to planners and decision makers, with the objective of improving the timeliness and the quality of the input to the decision process." According to Forrester Research, business intelligence is "a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insights and decision-making." Under this definition, business intelligence encompasses information management (data integration, data quality, data warehousing, master-data management, text- and content-analytics, et al.). Therefore, Forrester refers to data preparation and data usage as two separate but closely linked segments of the business-intelligence architectural stack. Some elements of business intelligence are: Multidimensional aggregation and allocation Denormalization, tagging, and standardization Realtime reporting with analytical alert A method of interfacing with unstructured data sources Group consolidation, budgeting, and rolling forecasts Statistical inference and probabilistic simulation Key performance indicators optimization Version control and process management Open item management Forrester distinguishes this from the business-intelligence market, which is "just the top layers of the BI architectural stack, such as reporting, analytics, and dashboards." === Compared with competitive intelligence === Though the term business intelligence is sometimes a synonym for competitive intelligence (because they both support decision making), BI uses technologies, processes, and applications to analyze mostly internal, structured data and business processes while competitive intelligence gathers, analyzes, and disseminates information with a topical focus on company competitors. If understood broadly, competitive intelligence can be considered as a subset of business intelligence. === Compared with business analytics === Business intelligence and business analytics are sometimes used interchangeably, but there are alternate definitions. Thomas Davenport, professor of information technology and management at Babson College argues that business intelligence should be divided into querying, reporting, Online analytical processing (OLAP), an "alerts" tool, and business analytics. In this definition, business analytics is the subset of BI focusing on statistics, prediction, and optimization, rather than the reporting functionality. == Unstructured data == Business operations can generate a very large amount of data in the form of emails, memos, notes from call centers, news, user groups, chats, reports, web pages, presentations, image files, video files, and marketing material. According to Merrill Lynch, more than 85% of all business information exists in these forms; a company might only use such a document a single time. Because of the way it is produced and stored, this information is either unstructured or semi-structured. The management of semi-structured data is an unsolved problem in the information technology industry. According to projections from Gartner (2003), white-collar workers spend 30–40% of their time searching, finding, and assessing unstructured data. BI uses both structured and unstructured data. The former is easy to search, and the latter contains a large quantity of the information needed for analysis and decision-making. Because of the difficulty of properly searching, finding, and assessing unstructured or semi-structured data, organizations may not draw upon these vast reservoirs of information, which could influence a particular decision, task, or project. This can ultimately lead to poorly informed decision-making. Therefore, when designing a business intelligence/DW solution, the specific problems associated with semi-structured and unstructured data must be accommodated, as well as those associated with structured data. === Limitations of semi-structured and unstructured data === There are several challenges to developing BI with semi-structured data. According to Inmon & Nesavich, some of those are: Physically accessing unstructured textual data – unstructured data is stored in a huge variety of formats. Terminology – Among researchers and analysts, there is a need to develop standardized terminology. Volume of data – As stated earlier, up to 85% of all data exists as semi-structured data. Couple that with the need for word-to-word and semantic analysis. Searchability of unstructured textual data – A simple search on some data, e.g. apple, results in links where there is a reference to that precise search term. (Inmon & Nesavich, 2008) gives an example: "a search is made on the term felony. In a simple search, the term felony is used, and everywhere there is a reference to felony, a hit to an unstructured document is made. But a simple search is crude. It does not find references to crime, arson, murder, embezzlement, vehicular homicide, and such, even though these crimes are types of felonies". === Metadata === To solve problems with searchability and assessment of data, it is necessary to know something about the content. This can be done by adding context through the use of metadata. Many systems already capture some metadata (e.g. filename, author, size, etc.), but more usef

    Read more →
  • Kurzsignale

    Kurzsignale

    The Short Signal Code, also known as the Short Signal Book (German: Kurzsignalbuch), was a short code system used by the Kriegsmarine (German Navy) during World War II to minimize the transmission duration of messages. == Description == The transmission of radio messages had the potential risks of revealing the submarine's presence and direction; if decoded the content was also revealed. Submarines need to provide information, mostly in standard form (position of convoy to attack and of submarine, weather information), to their bases. Initially Morse code transmissions could be used. To inhibit detection, the duration of messages needed to be minimised; for this, Kurzsignale short-coding was used. To prevent interception, messages needed to be encrypted by the Enigma machine. To shorten transmission even further, the message could be sent by a fast machine instead of a human radio operator. For example, the Kurier system – not implemented in time – decreased the time to send a Morse dot from around 50 milliseconds for a human to 1 millisecond. == Short Signal book == The Kurzsignale code was intended to shorten transmission time to below the time required to get a directional fix. It was not primarily intended to hide signal contents; protection was intended to be achieved by encoding with the Enigma machine. A copy of the Kurzsignale code book was captured from German submarine U-110 on 9 May 1941. In August 1941, Dönitz began addressing U-boats by the names of their commanders, instead of boat numbers. The method of defining U-boat meeting points in the Short Signal Book was regarded as compromised, so a method was defined by B-Dienst cryptanalysts to disguise their positions on the Kriegsmarine German Naval Grid System (German:Gradnetzmeldeverfahren) was introduced and used until the end of the war == Radio direction finding == Aware of the danger presented by radio direction finding (RDF), the Kriegsmarine developed various systems to speed up broadcast. The Kurzsignale code system condensed messages into short codes consisting of short sequences for common terms such as "convoy location" so that additional descriptions would not be needed in the message. The resulting Kurzsignal was then encoded with the Enigma machine and subsequently transmitted as rapidly as possible, typically taking about 20 seconds. Typical length of an information or weather signal was about 25 characters. Conventional RDF needed about a minute to fix the bearing of a radio signal, and the Kurzsignale protected against this. However, the huff-duff system which was in use by the Allies could cope with these short transmissions. The fully automated burst transmission Kurier system, in testing from August 1944, could send a Kurzsignal in not more than 460 milliseconds; this was short enough to prevent location even by huff-duff and, if deployed, would have been a serious setback for Allied anti-submarine and code-breaking activities. By late 1944 the Kurier program was a top priority, but the war ended before the system was operational. == Short Weather cipher == A similar coding system was used for weather reports from U-boats, the Wetterkurzschlüssel (Short Weather Cipher). Code books were captured from U-559 on 30 October 1942.

    Read more →
  • Microsoft Security Development Lifecycle

    Microsoft Security Development Lifecycle

    The Microsoft Security Development Lifecycle (SDL) is the approach Microsoft uses to integrate security into DevOps processes (sometimes called a DevSecOps approach). You can use this SDL guidance and documentation to adapt this approach and practices to your organization. == Overview == The practices outlined in the SDL approach are applicable to all types of software development and across all platforms, ranging from traditional waterfall methodologies to modern DevOps approaches. They can generally be applied to the following: Software – whether you are developing software code for firmware, AI applications, operating systems, drivers, IoT Devices, mobile device apps, web services, plug-ins or applets, hardware microcode, low-code/no-code apps, or other software formats. Note that most practices in the SDL are applicable to secure computer hardware development as well. Platforms – whether the software is running on a ‘serverless’ platform approach, on an on-premises server, a mobile device, a cloud hosted VM, a user endpoint, as part of a Software as a Service (SaaS) application, a cloud edge device, an IoT device, or anywhere else. == Practices == The SDL recommends 10 security practices to incorporate into your development workflows. Applying the 10 security practices of SDL is an ongoing process of improvement so a key recommendation is to begin from some point and keep enhancing as you proceed. This continuous process involves changes to culture, strategy, processes, and technical controls as you embed security skills and practices into DevOps workflows. The 10 SDL practices are: Establish security standards, metrics, and governance Require use of proven security features, languages, and frameworks Perform security design review and threat modeling Define and use cryptography standards Secure the software supply chain Secure the engineering environment Perform security testing Ensure operational platform security Implement security monitoring and response Provide security training == Versions ==

    Read more →
  • Apache CarbonData

    Apache CarbonData

    Apache CarbonData is a free and open-source column-oriented data storage format of the Apache Hadoop ecosystem. It is similar to the other columnar-storage file formats available in Hadoop namely RCFile and ORC. It is compatible with most of the data processing frameworks in the Hadoop environment. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk. == History == CarbonData was developed at Huawei in 2013. The project was donated to the Apache Community in 2015 submitted to the Apache Incubator in June 2016. The project won top honors in the BlackDuck 2016 Open Source Rookies of the Year's Big Data category. Apache CarbonData has been a top-level Apache Software Foundation (ASF)-sponsored project since May 1, 2017.

    Read more →
  • Menu hack

    Menu hack

    A menu hack is a non-standard method of ordering food, usually at fast-food or fast casual restaurants, that offers a different result than what is explicitly stated on a menu. Menu hacks may range from a simple alternate flavor to "gaming the system" in order to obtain more food than normal. They are often spread on social media platforms such as TikTok, and are more popular with Generation Z, which has been known to customize their orders more than previous generations. Hacks are sometimes officially added to the menu after their popularity grows. However, in some cases, they have been criticized for overburdening fast food employees with outlandish requests, sparking debate as to whether certain menu hacks are unethical. The list of all possible menu hacks is called a secret menu. == History == The term "menu hack" stems from hacker culture and its tradition of overcoming previously imposed limitations. However, the tradition of ordering from a secret menu dates back to the early days of fast food. "Animal style" fries, a word of mouth menu item ordered from In-N-Out since the 1960s, was rumored to have been created by local surfers. In the Information Age, the rise of social media gave influencers the ability to communicate unique food combinations to their followers, which proved to go viral easily. Design mistakes in food ordering apps also proved to be easily exploitable. In some cases, these hacks boosted the profile of brands on social media, while in others, they caused financial harm when the company was unprepared to handle the sudden influx of unusual orders. One restaurant chain notable for the phenomenon is Chipotle Mexican Grill. A viral hack from Alexis Frost, suggesting a quesadilla with fajita vegetables inside, dipped in Chipotle vinaigrette mixed with sour cream, obtained 1.9 million views on TikTok, overloading the chain's workers, who had to work harder to prepare more vegetables and vinaigrette. Some restaurants began to deny the dish to customers, forcing them to only order meat and cheese on quesadillas. The company ultimately left the dish on the menu, but urged customers to stop ordering it via social media. When it later officially added the Fajita Quesadilla to the menu, digital sales nearly doubled. A method to order nachos, which are not officially on the menu, was also noted by customers. Starbucks is also famous for menu hacks, including the Pink Drink, a "Barbiecore" beverage in which coconut milk replaced the water in the strawberry açaí refresher. After it went viral, the company made it a permanent menu item and distributed it bottled in grocery stores. == Controversy == Menu hacks have been subject to a growing backlash, with employees stating that they "dread" younger customers due to the proliferation of unusual orders. Service industry workers, already overworked and underpaid, have called the rise of menu hacks and their difficulty to make an additional reason to unionize and demand higher wages.

    Read more →
  • HTTP Strict Transport Security

    HTTP Strict Transport Security

    HTTP Strict Transport Security (HSTS) is a policy mechanism that helps to protect websites against man-in-the-middle attacks such as protocol downgrade attacks and cookie hijacking. It allows web servers to declare that web browsers (or other complying user agents) should automatically interact with it using only HTTPS connections, which provide Transport Layer Security (TLS/SSL), unlike the insecure HTTP used alone. HSTS is an IETF standards track protocol and is specified in RFC 6797. The HSTS Policy is communicated by the server to the user agent via an HTTP response header field named Strict-Transport-Security. HSTS Policy specifies a period of time during which the user agent should only access the server in a secure fashion. Websites using HSTS often do not accept clear text HTTP, either by rejecting connections over HTTP or systematically redirecting users to HTTPS (though this is not required by the specification). The consequence of this is that a user-agent not capable of doing TLS will not be able to connect to the site. The protection normally only applies after a user has visited the site at least once, relying on the principle of "trust on first use". The way this protection works is that when a user entering or selecting an HTTP (not HTTPS) URL to the site, the client, such as a Web browser, will automatically upgrade to HTTPS without making an HTTP request, thereby preventing any HTTP man-in-the-middle attack from occurring. To counteract this problem, an HSTS preload list maintained by Google Chrome and used by other major web browsers is maintained. If a domain is on this list, the browser skips the initial request and encrypts all communication immediately. Additional domains can be registered at no cost. == Specification history == The HSTS specification was published as RFC 6797 on 19 November 2012 after being approved on 2 October 2012 by the IESG for publication as a Proposed Standard RFC. The authors originally submitted it as an Internet Draft on 17 June 2010. With the conversion to an Internet Draft, the specification name was altered from "Strict Transport Security" (STS) to "HTTP Strict Transport Security", because the specification applies only to HTTP. The HTTP response header field defined in the HSTS specification however remains named "Strict-Transport-Security". The last so-called "community version" of the then-named "STS" specification was published on 18 December 2009, with revisions based on community feedback. The original draft specification by Jeff Hodges from PayPal, Collin Jackson, and Adam Barth was published on 18 September 2009. The HSTS specification is based on original work by Jackson and Barth as described in their paper "ForceHTTPS: Protecting High-Security Web Sites from Network Attacks". Additionally, HSTS is the realization of one facet of an overall vision for improving web security, put forward by Jeff Hodges and Andy Steingruebl in their 2010 paper The Need for Coherent Web Security Policy Framework(s). == HSTS mechanism overview == A server implements an HSTS policy by supplying a header over an HTTPS connection (HSTS headers over HTTP are ignored). For example, a server could send a header such that future requests to the domain for the next year (max-age is specified in seconds; 31,536,000 is equal to one non-leap year) use only HTTPS: Strict-Transport-Security: max-age=31536000. When a web application issues HSTS Policy to user agents, conformant user agents behave as follows: Automatically turn any insecure links referencing the web application into secure links (e.g. http://example.com/some/page/ will be modified to https://example.com/some/page/ before accessing the server). If the security of the connection cannot be ensured (e.g. the server's TLS certificate is not trusted), the user agent must terminate the connection and should not allow the user to access the web application. This helps protect web application users against some passive (eavesdropping) and active network attacks. A man-in-the-middle attacker has a greatly reduced ability to intercept requests and responses between a user and a web application server while the user's browser has HSTS Policy in effect for that web application. == Applicability == The most important security vulnerability that HSTS can fix is SSL-stripping man-in-the-middle attacks, first publicly introduced by Moxie Marlinspike in his 2009 BlackHat Federal talk "New Tricks For Defeating SSL In Practice". The SSL (and TLS) stripping attack works by transparently converting a secure HTTPS connection into a plain HTTP connection. The user can see that the connection is insecure, but crucially there is no way of knowing whether the connection should be secure. At the time of Marlinspike's talk, many websites did not use TLS/SSL, therefore there was no way of knowing (without prior knowledge) whether the use of plain HTTP was due to an attack, or simply because the website had not implemented TLS/SSL. Additionally, no warnings are presented to the user during the downgrade process, making the attack fairly subtle to all but the most vigilant. Marlinspike's sslstrip tool, presented at Black Hat DC 2009, fully automates the attack. HSTS addresses this problem by informing the browser that connections to the site should always use TLS/SSL. The HSTS header can be stripped by the attacker if this is the user's first visit. Google Chrome, Mozilla Firefox, Internet Explorer, and Microsoft Edge attempt to limit this problem by including a "pre-loaded" list of HSTS sites. Unfortunately this solution cannot scale to include all websites on the internet. See limitations, below. HSTS can also help to prevent having one's cookie-based website login credentials stolen by widely available tools such as Firesheep. Because HSTS is time limited, it is sensitive to attacks involving shifting the victim's computer time e.g. using false NTP packets. == Limitations == The initial request remains unprotected from active attacks if it uses an insecure protocol such as plain HTTP or if the URI for the initial request was obtained over an insecure channel. The same applies to the first request after the activity period specified in the advertised HSTS Policy max-age (sites should set a period of several days or months depending on user activity and behavior). === Solutions with preload list === Google Chrome, Mozilla Firefox, and Internet Explorer/Microsoft Edge address this limitation by implementing a "HSTS preloaded list", which is a list that contains known sites supporting HSTS. This list is distributed with the browser so that it uses HTTPS for the initial request to the listed sites as well. As previously mentioned, these pre-loaded lists cannot scale to cover the entire Web. A potential solution might be achieved by using DNS records to declare HSTS Policy, and accessing them securely via DNSSEC, optionally with certificate fingerprints to ensure validity (which requires running a validating resolver to avoid last mile issues). Junade Ali has noted that HSTS is ineffective against the use of false domains; by using DNS-based attacks, it is possible for a man-in-the-middle interceptor to serve traffic from an artificial domain which is not on the HSTS Preload list, this can be made possible by DNS Spoofing Attacks, or simply a domain name that misleadingly resembles the real domain name such as www.example.org instead of www.example.com. Even with an HSTS preloaded list, HSTS cannot prevent advanced attacks against TLS itself, such as the BEAST or CRIME attacks introduced by Juliano Rizzo and Thai Duong. Attacks against TLS itself are orthogonal to HSTS policy enforcement. Neither can it protect against attacks on the server - if someone compromises it, it will happily serve any content over TLS. === Privacy issues === HSTS can be used to near-indelibly tag visiting browsers with recoverable identifying data (supercookies) which can persist in and out of browser "incognito" privacy modes. By creating a web page that makes multiple HTTP requests to selected domains, for example, if twenty browser requests to twenty different domains are used, theoretically over one million visitors can be distinguished (220) due to the resulting requests arriving via HTTP vs. HTTPS; the latter being the previously recorded binary "bits" established earlier via HSTS headers. == Browser support == Chromium and Google Chrome since version 4.0.211.0 Firefox since version 4; with Firefox 17, Mozilla integrates a list of websites supporting HSTS. Opera since version 12 Safari since OS X Mavericks (version 10.9, late 2013) Internet Explorer 11 on Windows 8.1 and Windows 7 with KB3058515 installed (Released as a Windows Update in June 2015) Microsoft Edge and Internet Explorer 11 on Windows 10 BlackBerry 10 Browser and WebView since BlackBerry OS 10.3.3. == Deployment best practices == Depending on the actual deployment there are certain threats (e.g. cookie injection attacks) t

    Read more →