[TF-AIDN] meeting notes from 19 April 2017 - final call for publication of guidelines and Urdu LGR

Sarmad Hussain sarmad.hussain at icann.org
Mon May 1 14:02:01 CEST 2017


Thanks Raed.

Let’s hear feedback from other members.  If TF members are ok, we can take forward the suggestion of defining implicit code point variants as you propose.

Regards,
Sarmad


From: Raed Al-Fayez [mailto:rfayez at citc.gov.sa]
Sent: Thursday, April 27, 2017 9:32 AM
To: Sarmad Hussain <sarmad.hussain at icann.org>; TF-AIDN <tf-aidn at meswg.org>
Subject: [Ext] RE: [TF-AIDN] meeting notes from 19 April 2017 - final call for publication of guidelines and Urdu LGR

Dear Sarmad, All,

First of all, sorry for my late reply; I was very busy over the last couple of days.

In our opinion, it is very important for a registry to: (a) secure the registry domain space and domain names, (b) make sure the clients' domains are usable and reachable (by allowing the registrant to activate the needed variants; we call this international reachability), and (c) respect the supported language repertoire (by only accepting the code points that represent the supported language).

From our point of view all of them are important, but for the language community the last one is the most important, because it will be difficult to add code points to the LGR (as <char> tags in the XML) that are not part of the language, even if this may be needed for something else (e.g. to ensure variant transitivity). Additionally, this matches one of the "design goals" of RFC 7940 (section 2), which states:
   o  An LGR needs to be able to express the set of valid code points
      that are allowed for registration under a specific administrator's
      policies.


Back now to the variant transitivity issue: I think if an LGR contains a variant relation between two code points in only one direction, then the other direction should be assumed to exist with a blocked type, even if not stated by the LGR, because of variant transitivity. In other words, there is no need to change the variant transitivity requirement; we only clarify that if an LGR does not capture all directions of the variant relations, then the missing ones will be considered to exist with a blocked type.
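As an illustration of this proposal, here is a minimal Python sketch of how a missing reverse direction could be filled in with a blocked type. The dictionary layout, the function name `add_implicit_blocked`, and the type label "activated" for the explicit direction are assumptions made for the example; they are not taken from RFC 7940 or from the ICANN LGR tool.

```python
from typing import Dict, Tuple

# Hypothetical in-memory form of an LGR's variant mappings:
# (source code point, variant code point) -> variant type.
VariantMap = Dict[Tuple[int, int], str]


def add_implicit_blocked(variants: VariantMap) -> VariantMap:
    """Return a copy where every missing reverse mapping is added as 'blocked'."""
    completed = dict(variants)
    for (source, variant) in variants:
        # Only fill in directions the LGR author left unstated;
        # an explicitly defined reverse mapping keeps its own type.
        completed.setdefault((variant, source), "blocked")
    return completed


# One-directional mapping: U+0660 (Arabic-Indic digit zero)
# -> U+06F0 (Extended Arabic-Indic digit zero).
explicit: VariantMap = {(0x0660, 0x06F0): "activated"}
completed = add_implicit_blocked(explicit)

for (source, variant), vtype in sorted(completed.items()):
    print(f"U+{source:04X} -> U+{variant:04X}: {vtype}")
```

In this sketch the explicit direction keeps the type the LGR gave it, and only the unstated direction is treated as blocked.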

I hope this solves the issue.

Raed


From: Sarmad Hussain [mailto:sarmad.hussain at icann.org]
Sent: Sunday, April 23, 2017 3:52 PM
To: Raed Al-Fayez; TF-AIDN
Subject: RE: [TF-AIDN] meeting notes from 19 April 2017 - final call for publication of guidelines and Urdu LGR

Dear Raed, All,

Thanks for the revised version.

Kindly note that the need for “out of repertoire” code points comes from the principle that variants are symmetric: If A is a variant of B, then B is a variant of A.  The two statements would require both A and B to be included in the repertoire as per RFC 7940 (the tool only checks for the requirement).

If a change to the symmetry principle is being suggested, it will be useful to propose the specific change – so that the implications can be analysed.  Taking out the principle of symmetry may have significant implications.  For example, without symmetry requirements, it would be possible to define ا to be a variant of آ but not vice versa.

Is there another solution?

Regards,
Sarmad

From: Raed Al-Fayez [mailto:rfayez at citc.gov.sa]
Sent: Sunday, April 23, 2017 5:20 PM
To: Sarmad Hussain <sarmad.hussain at icann.org>; TF-AIDN <tf-aidn at meswg.org>
Subject: [Ext] RE: [TF-AIDN] meeting notes from 19 April 2017 - final call for publication of guidelines and Urdu LGR

Dear Sarmad,

I don't like the idea of adding "out-of-repertoire" code points to the Arabic language repertoire just because the ICANN tool requires it, while the RFC does not prohibit variants outside the repertoire. I believe that transitivity applies to variants and not to code points, and we can have variants that are not part of our repertoire.

Anyway, I have simplified and generalized the variant rules as requested in our last call (see the attached file).

Finally, I highly recommend submitting the Arabic and Urdu LGRs together, especially since the Arabic LGR was ready a long time ago and the recent (optional) changes were only requested in our last meeting.

Raed


From: tf-aidn-bounces at meswg.org [mailto:tf-aidn-bounces at meswg.org] On Behalf Of Sarmad Hussain
Sent: Friday, April 21, 2017 11:02 AM
To: TF-AIDN
Subject: [Attachment blocked] [TF-AIDN] meeting notes from 19 April 2017 - final call for publication of guidelines and Urdu LGR

Dear All,

We had a very fruitful discussion during our last online call on the earlier items being discussed over email (see trailing emails).

In the context of the questions raised for the Urdu language LGR, it was agreed that the language community may finalize the variant sets based on its own requirements.  A script level analysis should be included at the time of integration and is not necessary for language level analysis (the decision is left to the relevant language community).  However, if the language community is not addressing script level analysis, this may be pointed out in the description of the LGR they produce.  This will allow those using the LGR in conjunction with other Arabic script LGRs, to have appropriate integration guidance.  Principle six in our document is updated accordingly.

Regarding the inclusion of out-of-repertoire code points in the Arabic language LGR to make it symmetric and transitive, a question was raised during the discussion on whether this is required by RFC 7940.  I have connected with an author of the RFC, who responded that the RFC is mainly focused on defining how you write down an LGR in XML, not on how to design an LGR.  Thus this requirement is not expected to be part of RFC 7940.  This is a design principle which we follow for the LGRs, as the fifth principle for variants in our (attached) document: “Variant sets are symmetric (if A is a variant of B, then vice versa is also true) and transitive (if A is a variant of B and B is a variant of C, then A is also a variant of C).”  This has been handled in two ways.  Where there are out-of-repertoire code points which do not have variants, they are excluded through an “excluded-cp” rule;  where such code points have variants, these are managed by defining reflexive-variants (see the Armenian example I shared earlier; also see https://datatracker.ietf.org/doc/draft-freytag-lager-variant-rules/).  But, to keep the symmetry property, the out-of-repertoire code points should be listed in the repertoire.
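To make the symmetry and transitivity principle concrete, the following Python sketch reports which mappings are missing from a set of variant pairs. It is only an illustration: the function names and the pair representation are assumptions, the third code point (U+0623) is included purely to show a transitivity gap and is not taken from the Arabic LGR, and this is not a description of how the ICANN tool performs its checks.

```python
from itertools import product
from typing import Set, Tuple

Pair = Tuple[int, int]  # (code point, variant code point)


def symmetry_gaps(pairs: Set[Pair]) -> Set[Pair]:
    """Mappings B -> A that are missing although A -> B is defined."""
    return {(b, a) for (a, b) in pairs if (b, a) not in pairs}


def transitivity_gaps(pairs: Set[Pair]) -> Set[Pair]:
    """Mappings A -> C that are missing although A -> B and B -> C are defined."""
    gaps = set()
    for (a, b), (c, d) in product(pairs, repeat=2):
        # Chain A -> B with B -> C; ignore chains that lead back to A itself.
        if b == c and a != d and (a, d) not in pairs:
            gaps.add((a, d))
    return gaps


# U+0627 ALEF, U+0622 ALEF WITH MADDA ABOVE, U+0623 ALEF WITH HAMZA ABOVE.
ALEF, ALEF_MADDA, ALEF_HAMZA = 0x0627, 0x0622, 0x0623
pairs = {(ALEF, ALEF_MADDA), (ALEF_MADDA, ALEF), (ALEF, ALEF_HAMZA)}

missing_symmetric = symmetry_gaps(pairs)
missing_transitive = transitivity_gaps(pairs)
```

A single pass like this reports only the gaps implied by the pairs as given; after adding the missing mappings, the check would need to be repeated until no gaps remain.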

Complexity introduced due to the conditional variants was also discussed.  The Arabic language WG agreed to review the Arabic language LGR to simplify such cases.  As per the discussion, this suggestion has been added to the first variant principle as well.

The revised guidelines are also attached, with changes in redline.  This version and the Urdu LGR are now ready for publication, if there is no further feedback.  We plan to do so by mid next week.  Please share any final comments on these by Tuesday 24 April.

We will publish the Arabic language LGR for public comments once it has been revised by the WG and discussed by TF-AIDN.

The UA WG is also requested to share the final report for publication.

Regards,
Sarmad






From: Sarmad Hussain
Sent: Wednesday, April 19, 2017 3:08 PM
To: TF-AIDN <tf-aidn at meswg.org>
Subject: RE: [TF-AIDN] Materials to consider and discuss for TF-AIDN Public comment release

Thanks Zied and Raed for your feedback.

For Urdu, both of you are suggesting that we keep the letters distinct in Urdu and merge them into a variant set during integration at script level, due to the variant constraint in Arabic language table.  We will discuss and finalize during our call.  We may have to update the text in our guidelines document to reflect this.

For the Arabic language LGR, I have uploaded the version circulated and generated a summary, which tests for transitivity and symmetry.  The summary report is attached for you to review the current gaps being reported by the tool.  Please note that if a code point is defined as a variant, it should be included in the repertoire as well.  The way to keep them “out of repertoire” is indicated by the example of Armenian shared earlier (by making them blocked self-variants).

Regarding the simplification of conditional variants, as an example, consider the following:

Code point: 0647 (ه)
Variant: 06BE (ھ)
  → activated: only in initial & medial positions, for international reachability
  → blocked: only in final & isolated positions


One way to simplify would be to make this variant activated unconditionally.  Of course there will be a trade-off between over-production and simplicity in such cases.

Regards,
Sarmad


From: Raed Al-Fayez [mailto:rfayez at citc.gov.sa]
Sent: Wednesday, April 19, 2017 2:07 PM
To: Sarmad Hussain <sarmad.hussain at icann.org>; TF-AIDN <tf-aidn at meswg.org>
Cc: Raed Al-Fayez <rfayez at citc.gov.sa>
Subject: [Ext] RE: [TF-AIDN] Materials to consider and discuss for TF-AIDN Public comment release

Dear Sarmad,

Thanks for reviewing the TF work.

Regarding the principle issue you raised in your email, and as we decided in the Istanbul meeting: if the Urdu users can't see the similarity between the code points you mentioned, then they should not document it in the Urdu LGR. However, for us as Arabic users, if we see them as variants then we need to add them to our Arabic LGR. Later, when we combine the two LGRs (based on the rules we decided in Istanbul), they will be variants, and this will secure the domain name space for the TLD registry. If we don't do that, the possibility of phishing will be high for Arabic users, and this will possibly go against RFC 6912’s "Contextual Safety Principle" and "least astonishment principle" from the Arabic user's perspective.

Regarding the technical issues with the Arabic language LGR, I have tested the XML file for the Arabic LGR in the ICANN LGR tool (https://lgrtool.icann.org/) and everything seems to be OK.

Also, we have purposely not added 06F0-06F9 to the repertoire, and at the same time we want them to be variants of the Arabic digits 0660-0669. In other words, we will not allow them to be registered, but we will allow them to be allocated for reachability purposes.
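The situation described here can be sketched in a few lines of Python: the repertoire holds only 0660-0669, each digit names its 06F0-06F9 counterpart as a variant, and a simple check lists the variant targets that fall outside the repertoire. The data layout and the function name `out_of_repertoire` are assumptions for illustration only; the sketch does not reproduce the actual LGR XML or the ICANN tool's logic.

```python
from typing import Dict, List, Set

# Repertoire: Arabic-Indic digits U+0660..U+0669 only.
repertoire: Set[int] = set(range(0x0660, 0x066A))

# Each digit maps to its Extended Arabic-Indic counterpart (U+06F0..U+06F9),
# which sits exactly 0x90 code points higher.
variants: Dict[int, int] = {cp: cp + 0x90 for cp in repertoire}


def out_of_repertoire(repertoire: Set[int], variants: Dict[int, int]) -> List[int]:
    """Variant targets that are not themselves listed in the repertoire."""
    return sorted({v for v in variants.values() if v not in repertoire})


missing = out_of_repertoire(repertoire, variants)
print(", ".join(f"U+{cp:04X}" for cp in missing))
```

Whether such code points should be reported as errors or tolerated is exactly the design question under discussion in this thread.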

Finally, it is not clear to me what you mean by simplifying the Heh conditional variant rules.

Raed

From: tf-aidn-bounces at meswg.org [mailto:tf-aidn-bounces at meswg.org] On Behalf Of Sarmad Hussain
Sent: Sunday, April 16, 2017 1:36 PM
To: TF-AIDN
Subject: Re: [TF-AIDN] Materials to consider and discuss for TF-AIDN Public comment release

Dear All,

I have had a chance to review the work I just shared with all of you.  I have the following comments.

We all agreed with the guidelines document I circulated in Istanbul.  However, in my opinion both the Arabic language and Urdu language LGRs we are now finalizing diverge from these guidelines.

Let me start with the Urdu LGR, as I am part of the WG which has developed it.  We have discussed this internally in the WG but do not have consensus on the following variant principle: “For each language, variant analysis should be extended to include variant analysis across Arabic script.”  This implies that letters آ and ا or ی and ے should be variants in Urdu due to Arabic script level analysis.  However, within Urdu these are considered clearly distinct letters in current use.  Thus, making these variants has a significant impact on the number of labels which are created and also on the expectations of the Urdu language community, possibly going against RFC 6912’s least astonishment principle.  However, if these code points are kept distinct, the issue is that when the LGR is integrated with other Arabic script based LGRs, e.g. for the Arabic language, it creates instability, as the Arabic language community would consider these as variants; even if not integrated, it still causes confusion for the Arabic script community not familiar with Urdu.

For the Arabic language LGR, there are minor technical matters.  I loaded the XML in the LGR toolset and it gives an error saying that some variants are out-of-repertoire.  For example, 06F0-06F9 are not listed in the repertoire but are defined in the variant sets.  This is a minor point: such cases need to be added to the repertoire as well.  In addition, there are also transitivity issues in the XML which need to be addressed.  Could you share a revised version of the XML?  (If one needs to define blocked variants, see the cross-script variant example in the Armenian script LGR for the root zone: https://www.icann.org/en/system/files/files/proposed-armenian-lgr-05nov15-en.xml).

Also, I would request the Arabic language WG to reconsider having so many conditional variant rules to manage different types (in the Heh cases).  This makes the LGR very complex and runs against RFC 6912’s simplicity principle.  Is that something which can be simplified (e.g. for the Heh cases)?

Let’s meet this Wednesday to discuss these and any other issues raised.

Regards,
Sarmad


From: Sarmad Hussain
Sent: Sunday, April 16, 2017 2:50 PM
To: 'TF-AIDN' <tf-aidn at meswg.org>
Subject: Materials to consider and discuss for TF-AIDN Public comment release

Dear All,

Please find attached the following:


1.      Arabic language LGR for the second level

2.      Urdu language LGR for the second level

3.      Guidelines we finalized for the second level analysis

Kindly review and send in your comments.

Regards,
Sarmad

From: Sarmad Hussain
Sent: Wednesday, April 05, 2017 5:37 PM
To: TF-AIDN <tf-aidn at meswg.org>
Subject: RE: [Ext] [TF-AIDN] Next TF-AIDN Call | Wednesday 5 April 2017 | 11.00 - 12.00 UTC

Dear All,

We had a discussion on how to close the current threads of work and plan for next steps.

For the current work, the following was suggested:


•        UA report to be circulated in one week for any final comments and then published by TF-AIDN for public comment in two weeks.  Abdelmonem and Zied will follow up.

•        Arabic, Urdu and (possibly) Pashto language tables to be circulated for final comments in one week, and published by TF-AIDN for public comment in two weeks.  Sarmad will follow up.

For next steps, the following was suggested:


•        For UA, a list of activities and their priorities will be developed and finalized by TF-AIDN, which will determine the next steps for UA work.  This plan will be discussed after the current document has been published.  Abdelmonem volunteered for this, but others are welcome to join.  Please coordinate with Abdelmonem.

•        For the language tables, work will start on drawing up a list of languages which need to be integrated eventually to develop a script level LGR for the second level.  The process will also include developing criteria for this inclusion.  Finally, once the list has been developed, TF-AIDN members will be requested to help with the languages identified for developing the LGRs, in collaboration with the relevant community.

Others on the call, please feel free to add any further details discussed.

Regards,
Sarmad

From: tf-aidn-bounces at meswg.org [mailto:tf-aidn-bounces at meswg.org] On Behalf Of Fahd Batayneh
Sent: Wednesday, April 05, 2017 12:51 AM
To: TF-AIDN <tf-aidn at meswg.org>
Subject: [Ext] [TF-AIDN] Next TF-AIDN Call | Wednesday 5 April 2017 | 11.00 - 12.00 UTC

Friends and Colleagues,

This is a gentle reminder that the TF-AIDN will have a call on Wednesday 5 April 2017 at 11.00 – 12.00 UTC. I will shortly circulate a meeting invitation with the bridge details.

Thank you,

Fahd